

SEMESTER II 





Mir 

Publishers 

Moscow 











This textbook directly continues 
the first volume of a course 
of geometry (M. M. Postnikov. 
Lectures in Geometry. Semester 1. 
Analytic Geometry. Moscow, Mir 
Publishers, 1981) based on 
lectures read by the author at 
Moscow University for students 
specializing in mathematics. 

It contains 27 lectures, each a 
nearly exact reproduction of an 
original lecture. 

It treats linear algebra, with 
elementary differential geometry 
of curves and surfaces in three- 
dimensional space added to pave 
the way for further discussions. 









M. M. Postnikov


Lectures in Geometry

Semester 2

Linear Algebra
and
Differential Geometry


Moscow, Nauka

Main Editorial Board for Physical and Mathematical
Literature



M. POSTNIKOV 

LECTURES 
IN GEOMETRY 

SEMESTER II 

LINEAR 

ALGEBRA 

AND 

DIFFERENTIAL 

GEOMETRY 


Translated from the Russian 
by Vladimir Shokurov 


MIR PUBLISHERS
MOSCOW 



First published 1982 

Revised from the 1979 Russian edition 


In English


© Main Editorial Board for Physical and Mathematical Literature,
Nauka Publishers, 1979

© English translation, Mir Publishers, 1982 





PREFACE 


This book is a direct continuation of the author's previous
book* and is akin to it in being a nearly faithful record of
the lectures delivered by the author in the second semester
of the first year at the Mathematics-Mechanics Faculty of
Moscow State University named after M. V. Lomonosov
to students of mathematics (a course in Linear Algebra and
Analytic Geometry). Naturally, in the selection of the material
and the order of presentation the author was guided by
the same considerations as in the first semester (see the
Preface in [1]). The number of lectures in the book is explained
by the fact that although the curriculum assigns
32 lectures to the course, in practice it is impossible to
deliver more than 27 lectures.

The course in Linear Algebra and Analytic Geometry is
just a part of a single two-year course in geometry, and
much in this book is accounted for, as regards the choice
of the material and its accentuation, by orientation to the
second year devoted to the differential geometry of manifolds.
In particular, it has proved possible (although it is
not envisaged by the curriculum) to transfer part of the
propaedeutic material of the third semester (the elementary
differential geometry of curves and surfaces in three-dimensional
space) to the second-semester course and this has


* M. M. Postnikov. Lectures in Geometry: Semester 1. Analytic Geometry.
Moscow, Nauka Publishers, 1979 (English translation, Mir
Publishers, Moscow, 1981, referred to as [1] in what follows).




substantially facilitated (not only for the lecturer but, what 
is of course more important, also for the students) the third 
semester course. At the same time, as experience has shown, 
this material appeals to the students and they learn it well 
on the whole already in the second semester. 


October 27, 1977 


M. M. Postnikov













CONTENTS 


Preface

Lecture 1

Vector spaces. Subspaces. Intersection of subspaces. Linear
spans. A sum of subspaces. The dimension of a subspace. The
dimension of a sum of subspaces. The dimension of a linear
span

Lecture 2

Matrix rank theorem. The rank of a matrix product. The
Kronecker-Capelli theorem. Solution of systems of linear
equations

Lecture 3

Direct sums of subspaces. Decomposition of a space as a direct
sum of subspaces. Factor spaces. Homomorphisms of vector
spaces. Direct sums of spaces

Lecture 4

The conjugate space. Dual spaces. A second conjugate space.
The transformation of a conjugate basis and of the coordinates
of covectors. Annulets. The space of solutions of a system
of homogeneous linear equations

Lecture 5

An annulet of an annulet and annulets of direct summands.
Bilinear functionals and bilinear forms. Bilinear functionals
in a conjugate space. Mixed bilinear functionals.
Tensors

Lecture 6

Multiplication of tensors. The basis of a space of tensors.
Contraction of tensors. The rank space of a multilinear
functional







Lecture 7

The rank of a multilinear functional. Functionals and
permutations. Alternation

Lecture 8

Skew-symmetric multilinear functionals. External multiplication.
Grassman algebra. External sums of covectors. Expansion
of skew-symmetric functionals with respect to the
external products of covectors of a basis

Lecture 9

The basis of a space of skew-symmetric functionals. Formulas
for the transformation of the basis of that space. Multivectors.
The external rank of a skew-symmetric functional.
Multivector rank theorem. Conditions for the equality of
multivectors

Lecture 10

Cartan's divisibility theorem. Plücker relations. The Plücker
coordinates of subspaces. Planes in an affine space.
Planes in a projective space and their coordinates

Lecture 11

Symmetric and skew-symmetric bilinear functionals. A matrix
of symmetric bilinear functionals. The rank of a bilinear
functional. Quadratic functionals and quadratic forms.
Lagrange theorem

Lecture 12

Jacobi theorem. Quadratic forms over the fields of complex
and real numbers. The law of inertia. Positively definite
quadratic functionals and forms

Lecture 13

Second degree hypersurfaces of an n-dimensional projective
space. Second degree hypersurfaces in a complex and a
real-complex projective space. Second degree hypersurfaces of an
n-dimensional affine space. Second degree hypersurfaces in
a complex and a real-complex affine space

Lecture 14

The algebra of linear operators. Operators and mixed bilinear
functionals. Linear operators and matrices. Invertible
operators. The adjoint operator. The Fredholm alternative.
Invariant subspaces and induced operators

Lecture 15

Eigenvalues. Characteristic roots. Diagonalizable operators.
Operators with simple spectrum. The existence of a
basis in which the matrix of an operator is triangular.
Nilpotent operators

Lecture 16

Decomposition of a nilpotent operator as a direct sum of
cyclic operators. Root subspaces. Normal Jordan form. The
Hamilton-Cayley theorem

Lecture 17

Complexification of a linear operator. Proper subspaces
belonging to characteristic roots. Operators whose
complexification is diagonalizable

Lecture 18

Euclidean and unitary spaces. Orthogonal complements.
The identification of vectors and covectors. Annulets and
orthogonal complements. Bilinear functionals and linear
operators. Elimination of arbitrariness in the identification of
tensors of different types. The metric tensor. Lowering and
raising of indices

Lecture 19

Adjoint operators. Self-adjoint operators. Skew-symmetric
and skew-Hermitian operators. Analogy between Hermitian
operators and real numbers. Spectral properties of self-adjoint
operators. The orthogonal diagonalizability of self-adjoint
operators

Lecture 20

Bringing quadratic forms into canonical form by orthogonal
transformation of variables. Second degree hypersurfaces in
a Euclidean point space. The minimax property of eigenvalues
of self-adjoint operators. Orthogonally diagonalizable
operators

Lecture 21

Positive operators. Isometric operators. Unitary matrices.
Polar factorization of invertible operators. A geometrical
interpretation of polar factorization. Parallel translations
and centroaffine transformations. Bringing a unitary operator
into diagonal form. A rotation of an n-dimensional Euclidean
space as a composition of rotations in two-dimensional
planes

Lecture 22

Smooth functions. Smooth hypersurfaces. Gradient. Derivatives
with respect to a vector. Vector fields. Singular points
of a vector field. A module of vector fields. Potential and
irrotational vector fields. The rotation of a vector field. The
divergence of a vector field. Vector analysis. Hamilton's
symbolic vector. Formulas for products. Compositions of
operators

Lecture 23

Continuous, smooth, and regular curves. Equivalent curves.
Regular curves in the plane and graphs of functions. The
tangential hyperplane of a hypersurface. The length of a
curve. Curves in the plane. Curves in a three-dimensional space

Lecture 24

Projections of a curve onto the coordinate planes of the moving
n-hedron. Frenet's formulas for a curve in the n-dimensional
space. Representation of a curve by its curvatures.
Regular surfaces. Examples of surfaces

Lecture 25

Vectors tangential to a surface. The tangential plane. The first
quadratic form of a surface. Mensuration of lengths and angles
on a surface. Diffeomorphisms of surfaces. Isometries and
the intrinsic geometry of a surface. Examples. Developables

Lecture 26

The tangential plane and the normal vector. The curvature
of a normal section. The second quadratic form of a surface.
The indicatrix of Dupin. Principal curvatures. The second
quadratic form of a graph. Ruled surfaces of zero curvature.
Surfaces of revolution

Lecture 27

Weingarten's derivation formulas. Coefficients of connection.
The Gauss theorem. The necessary and sufficient conditions
of isometry


Subject Index 









Lecture 1


Vector spaces • Subspaces • Intersection of subspaces • Linear 
spans • A sum of subspaces • The dimension of a subspace • The 
dimension of a sum of subspaces • The dimension of a linear 
span 


In this semester we shall transfer the results obtained in the 
first semester to the case of any n. In the main we shall follow 
the same plan of presentation as before. 

Recall (see Definition 1 in Lecture 1 of [1]) that a vector
(or linear) space over a field K is a set $\mathcal{V}$ whose members
are called vectors and where the operation of addition
$x, y \mapsto x + y$ and the operation $x \mapsto kx$ of multiplication
by any number $k \in K$ are defined. It is also required that
under addition $\mathcal{V}$ should be an Abelian group and that
four natural axioms should hold for multiplication by
numbers in K.

The concepts of a linear combination of vectors and of
linearly dependent or independent families and sets of
vectors have meaning in such a space. A space $\mathcal{V}$ is said
to be finite-dimensional if there exists in it a finite basis,
i.e. a family of vectors in terms of which any vector of $\mathcal{V}$
can be linearly expressed in a unique way. The number of
vectors is the same in all the bases. It is called the dimension
of the vector space and designated by the symbol
$\dim \mathcal{V}$.

Let $\mathcal{V}$ be an arbitrary finite-dimensional vector space.

Definition 1. A subset $\mathcal{P}$ of a space $\mathcal{V}$ is said to be its
subspace if every linear combination $k_1 x_1 + \dots + k_m x_m$
of any vectors $x_1, \dots, x_m \in \mathcal{P}$ belongs to $\mathcal{P}$.







It is obvious that $\mathcal{P}$ is a subspace if and only if $x + y \in \mathcal{P}$
and $kx \in \mathcal{P}$ for any vectors $x, y \in \mathcal{P}$ and any number
$k \in K$.

In other words, the fact that $\mathcal{P}$ is a subspace means that
the correspondences $x, y \mapsto x + y$ and $x \mapsto kx$, where
$x, y \in \mathcal{P}$ and $k \in K$, define some operations in $\mathcal{P}$. It is clear
that under these operations the subspace $\mathcal{P}$ is a vector space. □

Examples of subspaces 

1. In any vector space $\mathcal{V}$ the one-member subset {0}
and the whole set $\mathcal{V}$ are subspaces. The subspace {0}
(ordinarily denoted simply by 0) is called zero and the subspace
$\mathcal{V}$ is called trivial.

2. In the vector space $K^n$ for any $m \le n$ the totality of
all vectors of the form $(x_1, \dots, x_m, 0, \dots, 0)$, whose
last $n - m$ coordinates are zero, is a subspace. This subspace
is isomorphic in a natural way to the space $K^m$.
3. In a vector space of polynomials (or, more generally, 
that of any functions satisfying certain conditions) a sub¬ 
space is the set of all polynomials (functions) equal to zero 
at one or several fixed points. 

4. A subspace is a set of all polynomials whose coeffi¬ 
cients are zero for given fixed degrees, as well as a set of all 
even or all odd polynomials. 

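Proposition 1. The intersection

$$\mathcal{P} = \bigcap_{\alpha} \mathcal{P}_\alpha$$

of an arbitrary family of subspaces $\mathcal{P}_\alpha$ is a subspace.

Proof. If $x, y \in \mathcal{P}$ then $x, y \in \mathcal{P}_\alpha$ for any $\alpha$ and
therefore $x + y \in \mathcal{P}_\alpha$, $kx \in \mathcal{P}_\alpha$ and hence (since $\alpha$ is
arbitrary) $x + y \in \mathcal{P}$, $kx \in \mathcal{P}$. □

Note that an intersection of subspaces cannot be empty
since any subspace contains a zero vector 0.

If $\mathcal{P} \cap \mathcal{Q} = 0$ then the subspaces $\mathcal{P}$ and $\mathcal{Q}$ are said
to be disjoint.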

In spite of its simplicity Proposition 1 leads to important 
consequences. 

Let S be an arbitrary subset of a vector space $\mathcal{V}$.

Definition 2. A subspace $\mathcal{P} \subset \mathcal{V}$ is said to be the (linear)
span of a set S if $S \subset \mathcal{P}$ and $\mathcal{P}$ is the smallest subspace
possessing this property, i.e. if every subspace $\mathcal{Q}$ for which
$S \subset \mathcal{Q}$ contains $\mathcal{P}$. The span of a set S is designated by the
symbol [S]. It is also called a subspace generated by the
set S.

Proposition 2. There exists a span [S] for any set $S \subset \mathcal{V}$.
It is the intersection $\mathcal{P}$ of all subspaces containing S.

Proof. Since every subspace $\mathcal{Q} \supset S$ participates in this
intersection (which is a subspace, according to Proposition 1),
the intersection $\mathcal{P}$ is contained in $\mathcal{Q}$. On the other hand, it
obviously contains S. □

In connection with this proof the question arises: have
we any right in general to speak of the intersection of subspaces
containing S? Why, strictly speaking, do such subspaces
exist? The formal answer is that in accordance with
the general principles of set theory the intersection of a family
of subsets of an arbitrary set $\mathcal{V}$ is well-defined even
when the family is empty and is in this case, however
paradoxical it may be, the whole of $\mathcal{V}$. But in our particular
case the situation is still simpler, because the family considered
is never empty. Indeed it is trivial that one of the
subspaces containing S is the whole space $\mathcal{V}$.

A more visual description of a span [S] is given by the 
following proposition: 

Proposition 3. The span [S] of a set S consists of all possible
linear combinations

(1) $k_1 x_1 + \dots + k_m x_m$, $\quad x_1, \dots, x_m \in S$, $\quad k_1, \dots, k_m \in K$,

of the vectors of S.

Proof. If $\mathcal{P}$ is a subspace containing S, then it obviously
contains all vectors of the form (1). On the other hand, it is
clear that the totality of all vectors (1) is a subspace
containing S. □

It follows from this proposition that a set of vectors of a
space $\mathcal{V}$ is complete if and only if it generates the whole of $\mathcal{V}$. □

Recall (see Lecture 12 in [1]) that two sets of vectors are
said to be linearly equivalent if each vector of either of the
sets can be linearly expressed in terms of the vectors of the
other set. It is clear that this is equivalent to saying that
a vector is a linear combination of vectors of one set if and
only if it is a linear combination of vectors of the other set,
i.e., according to Proposition 3, to saying that the spans of
both sets coincide (both sets generate the same subspace).

Unlike the intersection the union of subspaces is not in 
general a subspace. To obtain a subspace it is necessary to 
pass from the union to its linear span. 

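Definition 3. A sum $\sum_\alpha \mathcal{P}_\alpha$ of an arbitrary family of
subspaces $\mathcal{P}_\alpha \subset \mathcal{V}$ is the span of their union:

$$\sum_\alpha \mathcal{P}_\alpha = \Big[\bigcup_\alpha \mathcal{P}_\alpha\Big].$$

For two subspaces $\mathcal{P}$ and $\mathcal{Q}$

$$\mathcal{P} + \mathcal{Q} = [\mathcal{P} \cup \mathcal{Q}].$$

It is clear that any linear combination of the vectors of
$\mathcal{P} \cup \mathcal{Q}$ has the form $x + y$, where $x \in \mathcal{P}$, $y \in \mathcal{Q}$. This
proves the following proposition:

Proposition 4. A sum $\mathcal{P} + \mathcal{Q}$ of the subspaces $\mathcal{P}$ and $\mathcal{Q}$
consists of all possible vectors of the form $x + y$, where $x \in \mathcal{P}$,
$y \in \mathcal{Q}$. □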

A similar proposition holds also of course for the sum 
of any family of subspaces. 

Thus far we have not used in any way the assumption
about the finite dimensionality of the vector space $\mathcal{V}$. We
shall now consider the questions where this assumption is
essential.

Let $n = \dim \mathcal{V}$.

Proposition 5. For the dimension $\dim \mathcal{P}$ of an arbitrary
subspace $\mathcal{P} \subset \mathcal{V}$ the inequality

$$\dim \mathcal{P} \le n$$

is correct.

One may hear from students and read in some textbooks
the following reasoning supposedly proving Proposition 5:
any $n + 1$ vectors of a subspace $\mathcal{P}$, being vectors of
an n-dimensional space, are linearly dependent; therefore
the subspace $\mathcal{P}$ cannot contain more than n linearly
independent vectors and so $\dim \mathcal{P} \le n$.




The inadequacy of this reasoning lies in the fact that it
presupposes the finite dimensionality of the subspace $\mathcal{P}$. As
a matter of fact it only proves that if there is a basis in $\mathcal{P}$
that basis contains no more than n elements. We have
therefore to use another, more complicated way of reasoning
to prove Proposition 5.

Proof of Proposition 5. If $\mathcal{P} = 0$, there is nothing to
prove. If $\mathcal{P} \ne 0$, then there is a nonzero vector $e_1 \in \mathcal{P}$.
If $\mathcal{P} = [e_1]$, then $e_1$ is obviously a basis in $\mathcal{P}$ and therefore
$\dim \mathcal{P} = 1$. If $\mathcal{P} \ne [e_1]$, then there is a vector $e_2$ in $\mathcal{P}$
that is not linearly expressible in terms of $e_1$, i.e. such that
the vectors $e_1, e_2$ are linearly independent. If $\mathcal{P} = [e_1, e_2]$,
then $e_1, e_2$ is a basis in $\mathcal{P}$ and hence $\dim \mathcal{P} = 2$. But if
$\mathcal{P} \ne [e_1, e_2]$, then there is a vector $e_3$ in $\mathcal{P}$ which is not
linearly expressible in terms of the vectors $e_1, e_2$, and so on.
Since $\dim \mathcal{V} = n$, this process must be over not later than
the vector $e_n$ appears. Consequently, the subspace $\mathcal{P}$ is
finite-dimensional and $\dim \mathcal{P} \le n$. □

If $\dim \mathcal{P} = n$, then any basis in $\mathcal{P}$, being a linearly
independent family consisting of n vectors, is a basis in $\mathcal{V}$
as well. Therefore $\mathcal{P} = \mathcal{V}$. But if $\dim \mathcal{P} < n$, then the
basis in $\mathcal{P}$, having fewer than n vectors, cannot be a complete
family in $\mathcal{V}$ and hence does not generate $\mathcal{V}$. Therefore
$\mathcal{P} \ne \mathcal{V}$. Thus a subspace $\mathcal{P} \subset \mathcal{V}$ coincides with $\mathcal{V}$ if and
only if $\dim \mathcal{P} = \dim \mathcal{V}$. □

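Theorem 1 (dimension of a sum of subspaces). For any
two subspaces $\mathcal{P}$ and $\mathcal{Q}$ the formula

$$\dim (\mathcal{P} + \mathcal{Q}) = \dim \mathcal{P} + \dim \mathcal{Q} - \dim (\mathcal{P} \cap \mathcal{Q})$$

is correct.

Proof. Let

$$\dim \mathcal{P} = p, \quad \dim \mathcal{Q} = q, \quad \dim (\mathcal{P} \cap \mathcal{Q}) = r.$$

Consider in $\mathcal{P} \cap \mathcal{Q}$ an arbitrary basis $e_1, \dots, e_r$. Adding
to this basis vector after vector we finally obtain some basis

(2) $e_1, \dots, e_r, f_1, \dots, f_{p-r}$

of the subspace $\mathcal{P} \supset \mathcal{P} \cap \mathcal{Q}$. Similarly in the subspace $\mathcal{Q}$
we can construct a basis of the form

(3) $e_1, \dots, e_r, g_1, \dots, g_{q-r}$.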




Theorem 1 is obviously proved if we show that the $p + q - r$
vectors

(4) $e_1, \dots, e_r, f_1, \dots, f_{p-r}, g_1, \dots, g_{q-r}$

form the basis of the subspace $\mathcal{P} + \mathcal{Q}$.

Linear independence. Let

$$k_1 e_1 + \dots + k_r e_r + l_1 f_1 + \dots + l_{p-r} f_{p-r} + m_1 g_1 + \dots + m_{q-r} g_{q-r} = 0.$$

Setting

$$e = k_1 e_1 + \dots + k_r e_r,$$
$$f = l_1 f_1 + \dots + l_{p-r} f_{p-r},$$
$$g = m_1 g_1 + \dots + m_{q-r} g_{q-r},$$

we obtain vectors $e \in \mathcal{P} \cap \mathcal{Q}$, $f \in \mathcal{P}$ and $g \in \mathcal{Q}$ such
that $e + f + g = 0$. Then $e + f \in \mathcal{P}$ and therefore
$g = -(e + f) \in \mathcal{P}$. Hence $g \in \mathcal{P} \cap \mathcal{Q}$ and consequently the
vector g can be linearly expressed in terms of the vectors
$e_1, \dots, e_r$. But under the hypothesis the vector g can be
linearly expressed in terms of the vectors $g_1, \dots, g_{q-r}$.
Since there can be no two distinct expressions for the same
vector in terms of the basis (3) this proves that both expressions
have zero coefficients. Thus $m_1 = 0, \dots, m_{q-r} = 0$
and hence $g = 0$.

But then $e + f = 0$ and consequently (since (2) is a basis)
$k_1 = 0, \dots, k_r = 0$, $l_1 = 0, \dots, l_{p-r} = 0$. This proves
that the vectors (4) are linearly independent.

Completeness. Any vector in $\mathcal{P} + \mathcal{Q}$ is, as we know, of
the form $x + y$ where $x \in \mathcal{P}$, $y \in \mathcal{Q}$. On adding the expansion
of the vector x with respect to the basis (2) to the expansion
of the vector y with respect to the basis (3) we obviously
obtain a representation of the vector $x + y$ as a linear
combination of the vectors (4). Consequently the family (4)
of vectors of the subspace $\mathcal{P} + \mathcal{Q}$ is complete.

Being linearly independent and complete, the family (4)
is a basis. □

Corollary 1. If $\mathcal{P} + \mathcal{Q} = \mathcal{V}$, then $\dim (\mathcal{P} \cap \mathcal{Q}) = p + q - n$.

Corollary 2. If $p + q > n$, then $\mathcal{P} \cap \mathcal{Q} \ne 0$.
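
The formula of Theorem 1 can be checked numerically on a small example. The Python sketch below is an illustration only and not part of the lectures: the helper names `rank` and `sum_and_intersection_dims` are assumptions introduced here, the field is taken to be the rationals, and dimensions of spans are computed by Gaussian elimination (a procedure the lectures have not yet discussed at this point). The subspaces are $\mathcal{P} = [e_1, e_2]$ and $\mathcal{Q} = [e_2, e_3]$ in $K^4$, whose intersection is visibly $[e_2]$.

```python
from fractions import Fraction

def rank(rows):
    """Dimension of the span of the given row vectors (Gaussian elimination)."""
    m = [[Fraction(x) for x in row] for row in rows]
    r = 0  # number of pivots found so far
    ncols = len(m[0]) if m else 0
    for c in range(ncols):
        # look for a row at position r or below with a nonzero entry in column c
        pivot = next((i for i in range(r, len(m)) if m[i][c] != 0), None)
        if pivot is None:
            continue
        m[r], m[pivot] = m[pivot], m[r]
        for i in range(r + 1, len(m)):
            factor = m[i][c] / m[r][c]
            m[i] = [a - factor * b for a, b in zip(m[i], m[r])]
        r += 1
    return r

def sum_and_intersection_dims(P, Q):
    """Return (p, q, dim(P+Q), dim(P n Q)); the last value is obtained from Theorem 1."""
    p, q = rank(P), rank(Q)
    dim_sum = rank(P + Q)      # the union of the two spanning sets spans P + Q
    dim_int = p + q - dim_sum  # Theorem 1 solved for the intersection
    return p, q, dim_sum, dim_int

P = [[1, 0, 0, 0], [0, 1, 0, 0]]  # spanned by e1, e2
Q = [[0, 1, 0, 0], [0, 0, 1, 0]]  # spanned by e2, e3
print(sum_and_intersection_dims(P, Q))  # (2, 2, 3, 1)
```

The value 1 obtained for the intersection agrees with the direct observation that $\mathcal{P} \cap \mathcal{Q} = [e_2]$.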





How can the dimension of a subspace be computed? The 
answer to this question depends of course on the way the 
subspace is given. Therefore we shall return to this question 
every time we come across a new way of giving subspaces. 
But at present we actually know one method of effective 
representation of subspaces, that of representing them as 
the linear span of a certain finite set of vectors. Therefore 
our general question can be stated concretely as a problem
in computing the dimension dim [S] of the span of an
arbitrary (finite) set of vectors S. It is this problem that we
shall now discuss.

Let S be an arbitrary finite set of vectors. We may assume
without loss of generality that it contains nonzero vectors
and consequently possesses linearly independent subsets.
By finiteness of the number of vectors in S there are among
these subsets maximal ones, i.e. such that joining to them
any other vector of S turns them into linearly dependent
sets. Since this is possible if and only if the vector to be
joined is linearly expressible in terms of the vectors of the
subset we deduce that any maximal linearly independent
subset $S_0$ of the set S is linearly equivalent to the whole set S,
i.e. (see above) generates the same subspace [S]. This means
that a set $S_0$ is complete in [S] and since it is, in addition,
linearly independent it follows that after an arbitrary numbering
it becomes a basis in [S]. So every maximal linearly
independent subfamily of the set S is a basis of the span [S]
of the set S.

Since all bases of any space consist of the same number 
of vectors it follows in particular that all maximal linearly 
independent subsets of the set S consist of the same number of 
vectors. 

Definition 4. The number of vectors of a maximal linearly 
independent subset of a set S is called the rank of the set S. 

According to what has just been said this definition is 
correct. 

In addition we see that the following proposition is true: 

Proposition 6. The dimension dim [S] of the span of a set 
of vectors S is equal to the rank of that set. □ 

On the face of it this proposition seems a vacuous tautology.
In fact it has a very deep content since it identifies the
number dim [S] we are interested in with a certain number
(rank) which can, at least in principle, be computed in a
finite number of steps estimated in advance, i.e. which is,
as one says, effectively computable.
Indeed, to compute the rank it is possible for example to
look over all the subsets of the set S (there are a finite
number of them!) and to determine for each subset whether
it is linearly independent (which also takes a finite number
of steps). Thus the significance of Proposition 6 lies in the
fact that it indicates a finite procedure for computing the
dimension of subspaces (when, we stress, the subspaces are
given as the spans of finite sets of vectors; finiteness is
obligatory for effectiveness!).

Of course the size of the required computation can be 
substantially reduced by arranging it in a reasonable way. 
The appropriate procedures will be dealt with in the next 
lecture. 
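
The finite procedure just described (looking over all the subsets of S) can be sketched in Python as follows. This is an illustration under stated assumptions and not the author's procedure: the names `is_independent` and `brute_force_rank` are introduced here, vectors are tuples of rational numbers, and linear independence of a single subset is tested by Gaussian elimination, which is one possible way of carrying out that step in finitely many operations.

```python
from fractions import Fraction
from itertools import combinations

def is_independent(vectors):
    """True if the given vectors are linearly independent over the rationals."""
    rows = [[Fraction(x) for x in v] for v in vectors]
    pivots = 0
    ncols = len(rows[0]) if rows else 0
    for c in range(ncols):
        pivot = next((i for i in range(pivots, len(rows)) if rows[i][c] != 0), None)
        if pivot is None:
            continue
        rows[pivots], rows[pivot] = rows[pivot], rows[pivots]
        for i in range(pivots + 1, len(rows)):
            factor = rows[i][c] / rows[pivots][c]
            rows[i] = [a - factor * b for a, b in zip(rows[i], rows[pivots])]
        pivots += 1
    # independent exactly when every vector contributed a pivot
    return pivots == len(rows)

def brute_force_rank(S):
    """Rank of the finite set S: the size of a largest linearly independent subset."""
    for size in range(len(S), 0, -1):
        for subset in combinations(S, size):
            if is_independent(subset):
                return size
    return 0  # S is empty or consists of zero vectors only

S = [(1, 2, 3), (2, 4, 6), (0, 1, 1), (1, 3, 4)]
print(brute_force_rank(S))  # 2
```

The search examines up to $2^{|S|}$ subsets, which makes plain why a more reasonable arrangement of the computation is desirable.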








Lecture 2 


Matrix rank theorem • The rank of a matrix product • The 
Kronecker-Capelli theorem • Solution of systems of linear 
equations 


The answer to the question about the rational method for 
computing the rank of a set of vectors put at the end of the 
preceding lecture naturally depends on the way of giving 
these vectors. We shall consider only one but most impor¬ 
tant variant where vectors are given by their coordinates in 
a certain basis. This is equivalent to assuming that our
vectors lie in the space of row vectors $K^n$.

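So let us be given m vectors

(1) $$a_1 = (a_{11}, \dots, a_{1n}), \quad \dots, \quad a_m = (a_{m1}, \dots, a_{mn})$$

of the space $K^n$. Arranging the components of the vectors
in the form of a rectangular matrix

(2) $$A = \begin{pmatrix} a_{11} & \dots & a_{1n} \\ \vdots & & \vdots \\ a_{m1} & \dots & a_{mn} \end{pmatrix}$$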

we can restate the problem we are interested in in the fol¬ 
lowing final form: 

Given a rectangular matrix (2). What is the rank of the 
set of its rows? 

It is in this form that we shall solve the problem. 

Let $1 \le p \le \min(m, n)$. On choosing in the matrix A
arbitrarily p rows and p columns and considering the elements
in their intersection we obtain a square "submatrix"
having p rows and p columns. The determinants of such
matrices are called the minor determinants or minors of order p
of the matrix A.

Definition 1. The highest order of nonzero minors, i.e. 
a number p such that there is no nonzero minor of order 
p + 1 in the matrix A but there is a nonzero minor of 
order p, is called the rank of the matrix A. 

Note that if all minors of order p + 1 are zero then so 
are all minors of order p + 2 since by the formula for the 
expansion of determinants any such minor is a linear combi¬ 
nation of minors of order p + 1. Also zero of course are 
all minors of higher order. 

It is clear that the rank p of a matrix (2) satisfies the
inequalities

$$0 \le p \le \min(m, n)$$

with p = 0 if and only if all elements of the matrix are
zero.

Looking over minors of higher and higher order we can 
always compute the rank of an arbitrary matrix in a finite 
number of steps. Therefore the answer to the question put 
above is given by the following theorem: 

Theorem 1 (rank of a matrix). The rank of an arbitrary 
matrix is equal to the rank of the set of its rows. 

Proof. Note first that in any interchange of the rows or
columns of a matrix A the set of all of its minors of each
order is bijectively mapped onto the set of the minors of
the same order of the transformed matrix, nonzero minors
becoming nonzero minors. Consequently, in every such
interchange the rank p of the matrix remains unchanged.

What happens to the rank of the rows? It is clear that it
remains unchanged on interchange of the rows. As to
interchanging the columns it reduces to a simultaneous
redesignation of the components of all vectors (1), which leaves all
linear relations between these vectors (or between some of
them) obviously unchanged. Therefore the rank of the set of
the rows of the matrix A also remains unaltered on any
interchange of the columns.

Since by interchanging the rows and columns we can
have a nonzero minor of order p of the matrix A occupy
the top left corner it follows that in proving the equation
p = r (where r denotes the rank of the set of rows) we may
assume without loss of generality that

$$\Delta = \begin{vmatrix} a_{11} & \dots & a_{1p} \\ \vdots & & \vdots \\ a_{p1} & \dots & a_{pp} \end{vmatrix} \ne 0.$$

If the first p rows of the matrix A were now linearly dependent,
then the rows of the determinant $\Delta$ would obviously
turn out to be linearly dependent too and so the determinant
would be zero. This proves that the rows $a_1, \dots, a_p$ of the
matrix A are linearly independent and consequently $p \le r$.
To prove the equation p = r it is therefore sufficient to
establish that any row $a_i$, with $i > p$, is linearly expressible
in terms of the rows $a_1, \dots, a_p$.

To this end consider the following determinant of order
$p + 1$:

(3) $$\begin{vmatrix} a_{11} & \dots & a_{1p} & a_{1j} \\ a_{21} & \dots & a_{2p} & a_{2j} \\ \vdots & & \vdots & \vdots \\ a_{p1} & \dots & a_{pp} & a_{pj} \\ a_{i1} & \dots & a_{ip} & a_{ij} \end{vmatrix},$$

where $1 \le j \le n$. If $1 \le j \le p$, then the determinant (3)
has two identical columns and is therefore zero. But if
$p + 1 \le j \le n$, then the determinant (3) is a minor
of the matrix A of order $p + 1$ (resulting from the choice
of the first p rows and columns besides the jth column and
ith row) and therefore also zero. Consequently, on expanding
this determinant by the last column we obtain for
any $j = 1, \dots, n$ an equation of the form

(4) $$A_1 a_{1j} + \dots + A_p a_{pj} + \Delta a_{ij} = 0,$$

where $A_1, \dots, A_p, \Delta$ are algebraic complements of the
elements of that column. These depend only on the elements in the
first p columns of the determinant (3) and are in particular
the same for all j. In vector notation therefore the n equations
(4) are equivalent to one equation of the form

$$A_1 a_1 + \dots + A_p a_p + \Delta a_i = 0.$$

Since under the hypothesis $\Delta \ne 0$ this proves that the
vector $a_i$, $p + 1 \le i \le m$, is linearly expressible in terms
of the vectors $a_1, \dots, a_p$. Consequently r = p. □








The proof above shows in particular that if the matrix A
has a nonzero minor of order p possessing the property that
all minors of order p + 1 "bordering" it are zero, then the rank
of the matrix is equal to p. □

This remark significantly simplifies of course computing 
the rank. 

In the particular case where the matrix A is square and 
its rank is equal to its order we obtain the following. 

Corollary. A determinant is nonzero if and only if its rows 
are linearly independent. 

It is clear that in transposing a matrix A the rank p 
remains unchanged. At the same time the rank of the rows 
in the transposed matrix is equal to the rank of the columns 
in the original matrix. This proves that the rank of the set 
of rows in an arbitrary matrix is equal to the rank of the set of 
its columns. □

A wonderful result relating the ranks of the families 
of vectors in two vector spaces having, generally speaking, 
even different dimensions! 
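
The three numbers identified above (the rank defined through minors, the rank of the set of rows, and the rank of the set of columns) can be compared on a concrete matrix. The Python sketch below is only an illustration: the function names `det`, `minor_rank` and `row_rank` are assumptions introduced here, the determinant is computed by expansion along the first row (practical for small matrices only), and the rank of the rows is found by Gaussian elimination over the rationals.

```python
from fractions import Fraction
from itertools import combinations

def det(m):
    """Determinant of a square matrix by expansion along the first row."""
    if len(m) == 1:
        return m[0][0]
    return sum((-1) ** j * m[0][j] * det([row[:j] + row[j + 1:] for row in m[1:]])
               for j in range(len(m)))

def minor_rank(A):
    """Highest order of a nonzero minor of A (Definition 1)."""
    m, n = len(A), len(A[0])
    for p in range(min(m, n), 0, -1):
        for rows in combinations(range(m), p):
            for cols in combinations(range(n), p):
                if det([[A[i][j] for j in cols] for i in rows]) != 0:
                    return p
    return 0

def row_rank(A):
    """Rank of the set of rows of A, by Gaussian elimination."""
    m = [[Fraction(x) for x in row] for row in A]
    r = 0
    for c in range(len(m[0])):
        pivot = next((i for i in range(r, len(m)) if m[i][c] != 0), None)
        if pivot is None:
            continue
        m[r], m[pivot] = m[pivot], m[r]
        for i in range(r + 1, len(m)):
            factor = m[i][c] / m[r][c]
            m[i] = [a - factor * b for a, b in zip(m[i], m[r])]
        r += 1
    return r

A = [[1, 2, 3, 4], [2, 4, 6, 8], [1, 0, 1, 0]]
At = [list(col) for col in zip(*A)]  # transposed matrix: its rows are the columns of A
print(minor_rank(A), row_rank(A), row_rank(At))  # 2 2 2
```

Here the rows lie in $K^4$ and the columns in $K^3$, yet both families have rank 2, as the theorem asserts.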

What happens to the rank under matrix multiplication? 

Let A be a matrix having (as above) n columns and m
rows and B a matrix having n rows and s columns. Then a
matrix AB is defined having m rows and s columns. If r(A)
is the rank of the matrix A and r(B) is the rank of the
matrix B, then what can be said about the rank r(AB)
of the matrix AB?

It turns out that in the general case one can only say
that the rank r(AB) does not exceed the lower of the ranks
r(A) and r(B):

Proposition 1. The inequalities

$$r(AB) \le r(A), \quad r(AB) \le r(B)$$

hold.

Proof. Let $A = (a_{ij})$, $B = (b_{jk})$ and $C = AB = (c_{ik})$.
By the definition of matrix multiplication

$$c_{ik} = \sum_{j=1}^{n} a_{ij} b_{jk}, \quad i = 1, \dots, m, \quad k = 1, \dots, s.$$

We introduce into consideration the row vectors of matrices
B and C:

$$b_1 = (b_{11}, \dots, b_{1s}), \ \dots, \ b_n = (b_{n1}, \dots, b_{ns}),$$
$$c_1 = (c_{11}, \dots, c_{1s}), \ \dots, \ c_m = (c_{m1}, \dots, c_{ms}).$$

Then the formulas for $c_{ik}$ can be rewritten in the following
form:

$$c_i = \sum_{j=1}^{n} a_{ij} b_j, \quad i = 1, \dots, m,$$

denoting that the vectors $c_1, \dots, c_m$ are linearly expressible
in terms of the vectors $b_1, \dots, b_n$. Hence

$$[c_1, \dots, c_m] \subset [b_1, \dots, b_n]$$

and therefore

$$\dim [c_1, \dots, c_m] \le \dim [b_1, \dots, b_n],$$

i.e. by the matrix rank theorem $r(AB) \le r(B)$.

The inequality $r(AB) \le r(A)$ can be proved in a similar
way (we should only consider columns instead of rows).
It is possible, however, to derive it from the inequality
already proved if we take advantage of the fact that transposing
leaves the rank unchanged and that $(AB)^T = B^T A^T$.
Indeed,

$$r(AB) = r((AB)^T) = r(B^T A^T) \le r(A^T) = r(A). \ \square$$

In the case where one of the matrices A or B is square and
nonsingular it is possible to prove a more precise result:

Proposition 2. If B is a square (n = s) and nonsingular
($\det B \ne 0$) matrix, then for any matrix A

$$r(AB) = r(A).$$

Similarly, if A is a square (n = m) and nonsingular
($\det A \ne 0$) matrix, then for any matrix B

$$r(AB) = r(B).$$

In short, multiplying by a nonsingular matrix leaves the
matrix rank unchanged.

Proof. For a nonsingular matrix B there exists an inverse
matrix $B^{-1}$ and $A = (AB) B^{-1}$. Therefore, according to
Proposition 1,

$$r(A) = r((AB) B^{-1}) \le r(AB).$$

Since also $r(AB) \le r(A)$ by the same proposition,
consequently $r(A) = r(AB)$. The equation $r(B) = r(AB)$
for a nonsingular matrix A can be proved in
a similar way. □
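
Both propositions can be illustrated on small matrices. The Python sketch below is not taken from the lectures: the helper names `rank` and `matmul` are assumptions introduced here, entries are rational, and the rank is computed by Gaussian elimination. The first pair of matrices shows that the inequalities of Proposition 1 can be strict; the second pair multiplies by a nonsingular matrix, as in Proposition 2.

```python
from fractions import Fraction

def rank(rows):
    """Rank of a matrix given as a list of rows (Gaussian elimination)."""
    m = [[Fraction(x) for x in row] for row in rows]
    r = 0
    for c in range(len(m[0])):
        pivot = next((i for i in range(r, len(m)) if m[i][c] != 0), None)
        if pivot is None:
            continue
        m[r], m[pivot] = m[pivot], m[r]
        for i in range(r + 1, len(m)):
            factor = m[i][c] / m[r][c]
            m[i] = [a - factor * b for a, b in zip(m[i], m[r])]
        r += 1
    return r

def matmul(A, B):
    """Product of an m x n matrix A and an n x s matrix B."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

# Proposition 1 with strict inequalities: both factors have rank 1, the product is zero.
A = [[1, 0], [0, 0]]
B = [[0, 0], [0, 1]]
print(rank(A), rank(B), rank(matmul(A, B)))  # 1 1 0

# Proposition 2: B2 is square with det B2 = 1, so r(A2 B2) = r(A2).
A2 = [[1, 2], [2, 4], [0, 1]]
B2 = [[1, 1], [0, 1]]
print(rank(A2), rank(matmul(A2, B2)))  # 2 2
```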

The theorem on the matrix rank allows us not only to
compute effectively ranks and to find maximal linearly
independent subsets but also helps for example to determine
whether a given vector b can be expressed in terms of given
vectors $a_1, \dots, a_m$ without having to find in explicit
form the coefficients of linear dependence.

It is indeed obvious that the vector b can be linearly expressed
in terms of the vectors $a_1, \dots, a_m$ if and only if each
maximal linearly independent subset of the set $a_1, \dots, a_m$
is also a maximal linearly independent subset of the extended
set $a_1, \dots, a_m, b$ and hence if the rank of the set
$a_1, \dots, a_m$ is equal to the rank of the set $a_1, \dots, a_m, b$. □

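It is useful to restate this fact in terms of the theory of
linear equations. If

$$\mathbf{a}_1 = (a_{11}, \dots, a_{1n}), \quad \dots, \quad \mathbf{a}_m = (a_{m1}, \dots, a_{mn}), \quad \mathbf{b} = (b_1, \dots, b_n),$$

then the vector equation

$$x_1\mathbf{a}_1 + \dots + x_m\mathbf{a}_m = \mathbf{b} \tag{5}$$

is equivalent to $n$ numerical equations

$$\begin{aligned} a_{11}x_1 + \dots + a_{m1}x_m &= b_1, \\ &\;\;\vdots \\ a_{1n}x_1 + \dots + a_{mn}x_m &= b_n. \end{aligned} \tag{6}$$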
















Relations (6) form a system of $n$ nonhomogeneous linear
equations in $m$ unknowns. This system is compatible, i.e.
has at least one solution $x_1, \dots, x_m$, if and only if equation
(5) holds for some $x_1, \dots, x_m$, i.e. if the vector $\mathbf{b}$ is linearly
expressible in terms of the vectors $\mathbf{a}_1, \dots, \mathbf{a}_m$.

On the other hand, by Theorem 1 the rank of the set of
vectors $\mathbf{a}_1, \dots, \mathbf{a}_m$ is equal to the rank of the matrix of
the coefficients

$$\begin{pmatrix} a_{11} & \dots & a_{m1} \\ \vdots & & \vdots \\ a_{1n} & \dots & a_{mn} \end{pmatrix} \tag{7}$$

of system (6) and the rank of the set of vectors
$\mathbf{a}_1, \dots, \mathbf{a}_m, \mathbf{b}$ is equal to the rank of the augmented matrix
of the coefficients

$$\begin{pmatrix} a_{11} & \dots & a_{m1} & b_1 \\ \vdots & & \vdots & \vdots \\ a_{1n} & \dots & a_{mn} & b_n \end{pmatrix} \tag{8}$$

obtained from the matrix (7) by adding a column of free
members.

This proves the following theorem. 

Theorem 2 (Kronecker-Capelli theorem). The system of 
linear equations (6) is compatible if and only if the rank of 
the matrix of its coefficients (7) is equal to the rank of the 
augmented matrix (8). 
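A small worked example may make the criterion concrete. The two systems below are chosen only for illustration and do not come from the lecture. For

```latex
\[
\begin{aligned} x_1 + x_2 &= 1, \\ 2x_1 + 2x_2 &= 3, \end{aligned}
\qquad
r\begin{pmatrix} 1 & 1 \\ 2 & 2 \end{pmatrix} = 1,
\qquad
r\begin{pmatrix} 1 & 1 & 1 \\ 2 & 2 & 3 \end{pmatrix} = 2,
\]
\[
\begin{aligned} x_1 + x_2 &= 1, \\ 2x_1 + 2x_2 &= 2, \end{aligned}
\qquad
r\begin{pmatrix} 1 & 1 \\ 2 & 2 \end{pmatrix} = 1,
\qquad
r\begin{pmatrix} 1 & 1 & 1 \\ 2 & 2 & 2 \end{pmatrix} = 1.
\]
```

In the first system the ranks differ, so it is incompatible; in the second they coincide, and the system has the solutions $x_1 = 1 - t$, $x_2 = t$ for arbitrary $t$.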

Let system (6) be compatible. How can all of its solutions
be found?

Let $r$ be the rank of the matrix (7). On interchanging the
equations and renaming (if necessary) the unknowns we
may assume without loss of generality that

$$\begin{vmatrix} a_{11} & \dots & a_{r1} \\ \vdots & & \vdots \\ a_{1r} & \dots & a_{rr} \end{vmatrix} \ne 0. \tag{9}$$

Since under the hypothesis system (6) is compatible, the
rank of the matrix (8) is by the Kronecker-Capelli theorem
also equal to $r$. This means (in view of condition (9)) that
the first $r$ rows of the matrix (8) (i.e. the first $r$ equations
(6)) are linearly independent and that any other row of the
matrix (8) (any other equation (6)) is a linear combination of
them. Therefore system (6) is equivalent to the system

$$\begin{aligned} a_{11}x_1 + \dots + a_{r1}x_r + \dots + a_{m1}x_m &= b_1, \\ &\;\;\vdots \\ a_{1r}x_1 + \dots + a_{rr}x_r + \dots + a_{mr}x_m &= b_r \end{aligned} \tag{10}$$

consisting of its first $r$ equations, i.e. any solution of
system (6) is a solution of system (10) and conversely any
solution of system (10) is a solution of system (6). Thus
everything has reduced to the solution of system (10) consisting
of linearly independent equations.

To solve this system we rewrite it in the form

$$\begin{aligned} a_{11}x_1 + \dots + a_{r1}x_r &= b_1 - a_{r+1,1}x_{r+1} - \dots - a_{m1}x_m, \\ &\;\;\vdots \\ a_{1r}x_1 + \dots + a_{rr}x_r &= b_r - a_{r+1,r}x_{r+1} - \dots - a_{mr}x_m. \end{aligned} \tag{11}$$

If we assign to the unknowns $x_{r+1}, \dots, x_m$ arbitrary values,
then system (11) becomes a system of $r$ equations in $r$
unknowns $x_1, \dots, x_r$ with a nonzero (by (9)) determinant
$\Delta$. We can therefore find the unknowns $x_1, \dots, x_r$
in a unique way by Cramer's formulas, which we know from the
algebra course. It is clear that this method gives us all
solutions of system (10) (i.e. of system (6)).

In practice, there is no need of course to interchange
the equations in advance and to rename the unknowns.
The procedure for solving an arbitrary system of linear
equations (6) is therefore as follows:

Stage 1. Computing the minors of the matrix of the
coefficients (7) we find its rank $r$, simultaneously discovering
at least one nonzero minor $\Delta$ of order $r$.

Stage 2. Bordering the found minor in the matrix (8)
we check that the rank of this matrix is also equal to $r$. (If
it is greater than $r$, i.e. equal to $r + 1$, then system (6)
is incompatible.) At this stage it is obviously sufficient
to compute only $n - r$ minors of order $r + 1$.

Stage 3. The minor $\Delta$ contains the coefficients of $r$
unknowns in $r$ equations. Leaving only these equations,
assigning to the other $m - r$ unknowns arbitrary values and
obtaining in this way a system of $r$ equations in $r$ unknowns
with a nonzero determinant we solve that system by Cramer's
formulas. Thus we find the values of the remaining $r$ unknowns.

The values obtained at stage 3 for the unknowns
$x_1, \dots, x_m$ are solutions of system (6) and any solution of this
system can be obtained in this way.
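The three stages can be sketched in code. The sketch below is an illustration under stated assumptions, not the lecture's procedure verbatim: it uses Gauss-Jordan elimination in place of minors and Cramer's formulas, it stores the system in the usual row form (each row of `A` is one equation), and the names `reduce_augmented` and `solve` are hypothetical.

```python
from fractions import Fraction

def reduce_augmented(A, b):
    """Gauss-Jordan reduction of [A | b]; returns (reduced rows, pivot columns)."""
    rows = [[Fraction(v) for v in row] + [Fraction(c)] for row, c in zip(A, b)]
    pivots, r = [], 0
    for col in range(len(A[0])):
        p = next((i for i in range(r, len(rows)) if rows[i][col] != 0), None)
        if p is None:
            continue                      # no pivot here: a free unknown
        rows[r], rows[p] = rows[p], rows[r]
        pivot = rows[r][col]
        rows[r] = [v / pivot for v in rows[r]]
        for i in range(len(rows)):
            if i != r and rows[i][col] != 0:
                f = rows[i][col]
                rows[i] = [a - f * c for a, c in zip(rows[i], rows[r])]
        pivots.append(col)
        r += 1
    return rows, pivots

def solve(A, b, free_values):
    """One solution for the chosen values of the free unknowns, or None if incompatible.

    free_values maps the index of a free unknown to its value (default 0).
    """
    rows, pivots = reduce_augmented(A, b)
    # Kronecker-Capelli: a row (0 ... 0 | c) with c != 0 means the ranks differ
    if any(all(v == 0 for v in row[:-1]) and row[-1] != 0 for row in rows):
        return None
    n = len(A[0])
    free = [j for j in range(n) if j not in pivots]
    x = [Fraction(0)] * n
    for j in free:
        x[j] = Fraction(free_values.get(j, 0))
    for r, col in enumerate(pivots):
        x[col] = rows[r][-1] - sum(rows[r][j] * x[j] for j in free)
    return x

A = [[1, 1, 1], [1, 2, 3], [2, 3, 4]]     # third equation = first + second, rank 2
print(solve(A, [6, 14, 20], {2: 3}))      # compatible; the third unknown is free
print(solve(A, [6, 14, 21], {}))          # incompatible right-hand side
```

Varying the values of the free unknowns produces every solution, exactly as in stage 3.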



Lecture 3 


Direct sums of subspaces • Decomposition of a space as a direct 
sum of subspaces • Factor spaces • Homomorphisms of vector 
spaces • Direct sums of spaces 


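Let $\mathcal{P}$ and $\mathcal{Q}$ be subspaces of a linear space $\mathcal{V}$. Recall that
their sum $\mathcal{P} + \mathcal{Q}$ consists of all vectors of the form $\mathbf{x} + \mathbf{y}$,
where $\mathbf{x} \in \mathcal{P}$, $\mathbf{y} \in \mathcal{Q}$.

Definition 1. A subspace $\mathcal{P} + \mathcal{Q}$ is said to be a direct
sum of the subspaces $\mathcal{P}$ and $\mathcal{Q}$ if each of its vectors can be
uniquely represented as $\mathbf{x} + \mathbf{y}$, $\mathbf{x} \in \mathcal{P}$, $\mathbf{y} \in \mathcal{Q}$.

In this case we write $\mathcal{P} \oplus \mathcal{Q}$, or $\mathcal{P} \dotplus \mathcal{Q}$, instead of
$\mathcal{P} + \mathcal{Q}$.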

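Proposition 1. A subspace $\mathcal{P} + \mathcal{Q}$ is a direct sum of the
subspaces $\mathcal{P}$ and $\mathcal{Q}$ if and only if these subspaces are disjoint,
i.e. $\mathcal{P} \cap \mathcal{Q} = 0$.

Proof. If we have the equation $\mathbf{x} + \mathbf{y} = \mathbf{x}_1 + \mathbf{y}_1$, where
$\mathbf{x}, \mathbf{x}_1 \in \mathcal{P}$ and $\mathbf{y}, \mathbf{y}_1 \in \mathcal{Q}$, then the vector
$\mathbf{x} - \mathbf{x}_1 = \mathbf{y}_1 - \mathbf{y}$ lies in $\mathcal{P} \cap \mathcal{Q}$. Therefore, if
$\mathcal{P} \cap \mathcal{Q} = 0$, then $\mathbf{x} = \mathbf{x}_1$ and $\mathbf{y} = \mathbf{y}_1$, i.e. the
representation of each vector of $\mathcal{P} + \mathcal{Q}$ as $\mathbf{x} + \mathbf{y}$,
$\mathbf{x} \in \mathcal{P}$, $\mathbf{y} \in \mathcal{Q}$, is unique. Conversely, if
$\mathcal{P} \cap \mathcal{Q} \ne 0$ and $\mathbf{a} \in \mathcal{P} \cap \mathcal{Q}$, $\mathbf{a} \ne 0$, then for any
vectors $\mathbf{x} \in \mathcal{P}$, $\mathbf{y} \in \mathcal{Q}$ we have the equation

$$\mathbf{x} + \mathbf{y} = (\mathbf{x} + \mathbf{a}) + (\mathbf{y} - \mathbf{a}),$$

where $\mathbf{x} + \mathbf{a} \in \mathcal{P}$ and $\mathbf{y} - \mathbf{a} \in \mathcal{Q}$, showing that the
representation of vectors of $\mathcal{P} + \mathcal{Q}$ as $\mathbf{x} + \mathbf{y}$,
$\mathbf{x} \in \mathcal{P}$, $\mathbf{y} \in \mathcal{Q}$, is not unique. □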

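It makes sense of course to speak of a direct sum of an
arbitrary number of subspaces as well. For example, a sum
$\mathcal{P} + \mathcal{Q} + \mathcal{R}$ of three subspaces is said to be direct if the
representation of each vector of $\mathcal{P} + \mathcal{Q} + \mathcal{R}$ as
$\mathbf{x} + \mathbf{y} + \mathbf{z}$, where $\mathbf{x} \in \mathcal{P}$, $\mathbf{y} \in \mathcal{Q}$, $\mathbf{z} \in \mathcal{R}$,
is unique. By analogy with Proposition 1 one would like to think
that for this to be the case it is necessary and sufficient that
the subspaces $\mathcal{P}$, $\mathcal{Q}$ and $\mathcal{R}$ should be mutually disjoint.
This is incorrect. For any two noncollinear vectors $\mathbf{a}$ and
$\mathbf{b}$, for example, the subspaces $\mathcal{P} = [\mathbf{a}]$, $\mathcal{Q} = [\mathbf{b}]$,
$\mathcal{R} = [\mathbf{a} + \mathbf{b}]$ are mutually disjoint, nevertheless their sum
$\mathcal{P} + \mathcal{Q} + \mathcal{R} = [\mathbf{a}, \mathbf{b}]$ is not direct.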
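To see concretely why the sum in this example is not direct, one can write a single vector in two different ways (a short illustration added here, using only the vectors $\mathbf{a}$ and $\mathbf{b}$ of the example):

```latex
\[
\mathbf{a} + \mathbf{b}
  = \underbrace{\mathbf{a}}_{\in [\mathbf{a}]}
  + \underbrace{\mathbf{b}}_{\in [\mathbf{b}]}
  + \underbrace{\mathbf{0}}_{\in [\mathbf{a}+\mathbf{b}]}
  = \underbrace{\mathbf{0}}_{\in [\mathbf{a}]}
  + \underbrace{\mathbf{0}}_{\in [\mathbf{b}]}
  + \underbrace{(\mathbf{a} + \mathbf{b})}_{\in [\mathbf{a}+\mathbf{b}]} .
\]
```

The two representations differ although the three subspaces intersect pairwise only in zero.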

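The true condition for a sum $\mathcal{P} + \mathcal{Q} + \mathcal{R}$ to be a direct
sum is given by the following proposition:

Proposition 2. A sum $\mathcal{P} + \mathcal{Q} + \mathcal{R}$ of three subspaces is
their direct sum if and only if each of them is disjoint from
the sum of the other two:

$$\mathcal{P} \cap (\mathcal{Q} + \mathcal{R}) = 0, \quad \mathcal{Q} \cap (\mathcal{P} + \mathcal{R}) = 0, \quad \mathcal{R} \cap (\mathcal{P} + \mathcal{Q}) = 0. \tag{1}$$

Proof. If we have the equation
$\mathbf{x} + \mathbf{y} + \mathbf{z} = \mathbf{x}_1 + \mathbf{y}_1 + \mathbf{z}_1$, where
$\mathbf{x}, \mathbf{x}_1 \in \mathcal{P}$, $\mathbf{y}, \mathbf{y}_1 \in \mathcal{Q}$, $\mathbf{z}, \mathbf{z}_1 \in \mathcal{R}$, then
$\mathbf{x} - \mathbf{x}_1 = (\mathbf{y}_1 - \mathbf{y}) + (\mathbf{z}_1 - \mathbf{z}) \in \mathcal{P} \cap (\mathcal{Q} + \mathcal{R})$.
Therefore, if $\mathbf{x}_1 \ne \mathbf{x}$, then $\mathcal{P} \cap (\mathcal{Q} + \mathcal{R}) \ne 0$. Similarly,
if $\mathbf{y}_1 \ne \mathbf{y}$, then $\mathcal{Q} \cap (\mathcal{P} + \mathcal{R}) \ne 0$, and if
$\mathbf{z}_1 \ne \mathbf{z}$, then $\mathcal{R} \cap (\mathcal{P} + \mathcal{Q}) \ne 0$.
Thus if the sum $\mathcal{P} + \mathcal{Q} + \mathcal{R}$ is not direct, then not all
conditions (1) hold. Conversely, if, for example,
$\mathcal{P} \cap (\mathcal{Q} + \mathcal{R}) \ne 0$ and $\mathbf{a} \in \mathcal{P} \cap (\mathcal{Q} + \mathcal{R})$,
$\mathbf{a} \ne 0$, then for any vectors $\mathbf{x} \in \mathcal{P}$, $\mathbf{y} \in \mathcal{Q}$,
$\mathbf{z} \in \mathcal{R}$ we have the equation

$$\mathbf{x} + \mathbf{y} + \mathbf{z} = (\mathbf{x} - \mathbf{a}) + (\mathbf{y} + \mathbf{b}) + (\mathbf{z} + \mathbf{c}),$$

where $\mathbf{b} \in \mathcal{Q}$, $\mathbf{c} \in \mathcal{R}$ are vectors such that
$\mathbf{a} = \mathbf{b} + \mathbf{c}$, and therefore $\mathcal{P} + \mathcal{Q} + \mathcal{R}$ is not a direct sum. □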

Of course a similar proposition is true for a sum of any 
number of subspaces as well. 

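Of particular significance is the case where $\mathcal{P} \oplus \mathcal{Q} = \mathcal{V}$.
In this case the vector space $\mathcal{V}$ is said to be decomposed as
a direct sum of the subspaces $\mathcal{P}$ and $\mathcal{Q}$.

Consider the following properties of the subspaces $\mathcal{P}$
and $\mathcal{Q}$:

1° Any vector in $\mathcal{V}$ is of the form $\mathbf{x} + \mathbf{y}$, where
$\mathbf{x} \in \mathcal{P}$, $\mathbf{y} \in \mathcal{Q}$, i.e. $\mathcal{V} = \mathcal{P} + \mathcal{Q}$.

2° The subspaces $\mathcal{P}$ and $\mathcal{Q}$ are disjoint, i.e. $\mathcal{P} \cap \mathcal{Q} = 0$.

3° The sum of the dimensions of the subspaces $\mathcal{P}$ and $\mathcal{Q}$
is equal to the dimension of the space $\mathcal{V}$:

$$\dim \mathcal{P} + \dim \mathcal{Q} = \dim \mathcal{V}.$$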

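Proposition 3. Any two of properties 1°, 2°, 3° imply
the third.

Proof. If properties 1° and 2° hold, then by the
theorem on the dimension of a sum (see Theorem 1 of Lecture 1)

$$\dim \mathcal{V} = \dim (\mathcal{P} + \mathcal{Q}) = \dim \mathcal{P} + \dim \mathcal{Q} - \dim (\mathcal{P} \cap \mathcal{Q}) = \dim \mathcal{P} + \dim \mathcal{Q}.$$

If properties 1° and 3° hold, then by the same theorem

$$\dim (\mathcal{P} \cap \mathcal{Q}) = \dim \mathcal{P} + \dim \mathcal{Q} - \dim (\mathcal{P} + \mathcal{Q}) = \dim \mathcal{P} + \dim \mathcal{Q} - \dim \mathcal{V} = 0,$$

and hence $\mathcal{P} \cap \mathcal{Q} = 0$.

If properties 2° and 3° hold, then again by the same
theorem

$$\dim (\mathcal{P} + \mathcal{Q}) = \dim \mathcal{P} + \dim \mathcal{Q} = \dim \mathcal{V},$$

and hence $\mathcal{P} + \mathcal{Q} = \mathcal{V}$. □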

According to Proposition 1 properties 1° and 2° imply
that $\mathcal{V} = \mathcal{P} \oplus \mathcal{Q}$. This proves the following

Corollary. The equation $\mathcal{V} = \mathcal{P} \oplus \mathcal{Q}$ holds if and only
if any two of properties 1°, 2°, 3° (and hence also the third)
hold.
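For subspaces of a coordinate space given by spanning vectors, the three properties can be tested through ranks. The following is a minimal sketch under assumptions of our own: subspaces of $\mathbb{K}^n$ with rational coordinates, and hypothetical helper names `rank` and `check_properties`. The dimension of the intersection is obtained from the theorem on the dimension of a sum.

```python
from fractions import Fraction

def rank(vectors):
    """Rank of a family of vectors (rows), by Gaussian elimination over the rationals."""
    rows = [[Fraction(v) for v in row] for row in vectors]
    ncols = len(rows[0]) if rows else 0
    r = 0
    for col in range(ncols):
        pivot = next((i for i in range(r, len(rows)) if rows[i][col] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(r + 1, len(rows)):
            factor = rows[i][col] / rows[r][col]
            rows[i] = [a - factor * b for a, b in zip(rows[i], rows[r])]
        r += 1
    return r

def check_properties(P_span, Q_span, n):
    """Truth values of properties 1, 2, 3 for P = [P_span], Q = [Q_span] in K^n."""
    p, q = rank(P_span), rank(Q_span)
    dim_sum = rank(P_span + Q_span)   # dim (P + Q): rank of the joined family
    dim_cap = p + q - dim_sum         # dim of the intersection, by the dimension theorem
    return dim_sum == n, dim_cap == 0, p + q == n

P = [(1, 0, 0), (0, 1, 0)]
print(check_properties(P, [(1, 1, 1)], 3))   # a complementary line: all three hold
print(check_properties(P, [(1, 1, 0)], 3))   # a line inside P: only property 3 holds
```

The second call illustrates that a single property (here 3°) does not by itself give a direct decomposition.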

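Definition 2. If $\mathcal{V} = \mathcal{P} \oplus \mathcal{Q}$, then the subspaces $\mathcal{P}$
and $\mathcal{Q}$ are said to be complementary.

Proposition 4. If subspaces $\mathcal{P}$ and $\mathcal{Q}$ are complementary,
then for any basis $\mathbf{e}_1, \dots, \mathbf{e}_p$ of the subspace $\mathcal{P}$ and any basis
$\mathbf{e}_{p+1}, \dots, \mathbf{e}_n$ of the subspace $\mathcal{Q}$ the vectors

$$\mathbf{e}_1, \dots, \mathbf{e}_p, \mathbf{e}_{p+1}, \dots, \mathbf{e}_n$$

form a basis of the space $\mathcal{V}$.

Conversely, if an arbitrary basis $\mathbf{e}_1, \dots, \mathbf{e}_n$ of the space $\mathcal{V}$
is partitioned into two subfamilies $\mathbf{e}_1, \dots, \mathbf{e}_p$ and
$\mathbf{e}_{p+1}, \dots, \mathbf{e}_n$, then the subspaces $\mathcal{P} = [\mathbf{e}_1, \dots, \mathbf{e}_p]$ and
$\mathcal{Q} = [\mathbf{e}_{p+1}, \dots, \mathbf{e}_n]$ are complementary.

Proof. In the first statement the vectors $\mathbf{e}_1, \dots, \mathbf{e}_p$,
$\mathbf{e}_{p+1}, \dots, \mathbf{e}_n$ form a complete family consisting of
$n = p + q$ vectors. It is therefore a basis. In the second
statement the subspaces $\mathcal{P}$ and $\mathcal{Q}$ have properties 1° and 3°
of those indicated above. Therefore $\mathcal{V} = \mathcal{P} \oplus \mathcal{Q}$. □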

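Corollary. For any subspace $\mathcal{P} \subset \mathcal{V}$ there exists a
complementary subspace $\mathcal{Q}$.

Proof. Let $\mathbf{e}_1, \dots, \mathbf{e}_p$ be an arbitrary basis of the subspace $\mathcal{P}$.
Supplement this basis with some vectors $\mathbf{e}_{p+1}, \dots, \mathbf{e}_n$ to
form a basis of the whole space $\mathcal{V}$. Then the subspace
$\mathcal{Q} = [\mathbf{e}_{p+1}, \dots, \mathbf{e}_n]$ is complementary to $\mathcal{P}$. □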

We see that a complementary subspace is constructed 
with a lot of arbitrariness. It turns out that there exists 
a construction allowing us to avoid this arbitrariness (if 
only partially). 

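Let $\mathcal{P}$ be an arbitrary subspace of a vector space $\mathcal{V}$.

Definition 3. Vectors $\mathbf{x}, \mathbf{y} \in \mathcal{V}$ are said to be congruent
modulo $\mathcal{P}$ if $\mathbf{x} - \mathbf{y} \in \mathcal{P}$. In this case one writes

$$\mathbf{x} \equiv \mathbf{y} \bmod \mathcal{P}.$$

The congruence relation is obviously an equivalence relation.
The corresponding sets of vectors congruent modulo $\mathcal{P}$
are called cosets of the space $\mathcal{V}$ modulo the subspace $\mathcal{P}$.
It is clear that the coset containing the vector $\mathbf{x}$ consists of all
vectors of the form $\mathbf{x} + \mathbf{a}$, $\mathbf{a} \in \mathcal{P}$. We shall designate it by
the symbol $\mathbf{x} + \mathcal{P}$. Another widespread designation is
$\mathbf{x} \bmod \mathcal{P}$.

It is easy to see that congruences can be added together
and multiplied by numbers, i.e. if

$$\mathbf{x} \equiv \mathbf{y} \bmod \mathcal{P} \quad \text{and} \quad \mathbf{x}_1 \equiv \mathbf{y}_1 \bmod \mathcal{P},$$

then

$$\mathbf{x} + \mathbf{x}_1 \equiv \mathbf{y} + \mathbf{y}_1 \bmod \mathcal{P}$$

and

$$k\mathbf{x} \equiv k\mathbf{y} \bmod \mathcal{P}$$

for any number $k \in \mathbb{K}$. Indeed, if $\mathbf{x} - \mathbf{y} \in \mathcal{P}$ and
$\mathbf{x}_1 - \mathbf{y}_1 \in \mathcal{P}$, then
$(\mathbf{x} + \mathbf{x}_1) - (\mathbf{y} + \mathbf{y}_1) = (\mathbf{x} - \mathbf{y}) + (\mathbf{x}_1 - \mathbf{y}_1) \in \mathcal{P}$
and, similarly, $k\mathbf{x} - k\mathbf{y} = k(\mathbf{x} - \mathbf{y}) \in \mathcal{P}$.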














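For cosets this means that the formulas

$$(\mathbf{x} + \mathcal{P}) + (\mathbf{y} + \mathcal{P}) = (\mathbf{x} + \mathbf{y}) + \mathcal{P} \tag{2}$$

and

$$k(\mathbf{x} + \mathcal{P}) = k\mathbf{x} + \mathcal{P} \tag{3}$$

correctly define their sum and product by a number.

A direct check shows that these operations satisfy the
vector space axioms. Thus under operations (2) and (3)
the set of all cosets of $\mathcal{V}$ modulo $\mathcal{P}$ is a vector space. □

Definition 4. This space is called the factor space of the space
$\mathcal{V}$ modulo the subspace $\mathcal{P}$. It is designated by the symbol
$\mathcal{V}/\mathcal{P}$.

In the first semester course in algebra a similar construction
was studied in detail for the case of groups and rings.

Proposition 5. Every subspace $\mathcal{Q}$ complementary to a
subspace $\mathcal{P}$ is isomorphic to the factor space $\mathcal{V}/\mathcal{P}$.

Proof. Consider a mapping $\varphi: \mathcal{Q} \to \mathcal{V}/\mathcal{P}$ defined by the
formula

$$\varphi(\mathbf{x}) = \mathbf{x} + \mathcal{P}, \quad \text{where } \mathbf{x} \in \mathcal{Q}.$$

If $\varphi(\mathbf{x}) = \varphi(\mathbf{x}_1)$, i.e. $\mathbf{x} + \mathcal{P} = \mathbf{x}_1 + \mathcal{P}$, then
$\mathbf{x} - \mathbf{x}_1 \in \mathcal{P} \cap \mathcal{Q} = 0$ and hence $\mathbf{x} = \mathbf{x}_1$. On the other
hand, any vector $\mathbf{z} \in \mathcal{V}$ has the form $\mathbf{x} + \mathbf{y}$, where
$\mathbf{x} \in \mathcal{Q}$, $\mathbf{y} \in \mathcal{P}$, and hence $\mathbf{z} + \mathcal{P} = \mathbf{x} + \mathcal{P}$.
This proves that the mapping $\varphi$ is bijective.
Since the mapping $\varphi$ obviously preserves sums and products
by numbers, it is therefore an isomorphism. □

The geometrical fact underlying Proposition 5 is that
every coset modulo $\mathcal{P}$ has a unique vector in common with $\mathcal{Q}$.

Proposition 5 implies that instead of complements $\mathcal{Q}$
we may consider the factor space $\mathcal{V}/\mathcal{P}$ whose construction
contains no arbitrariness.

It follows from Proposition 5 that

$$\dim \mathcal{V}/\mathcal{P} = \dim \mathcal{V} - \dim \mathcal{P}. \tag{4}$$

Indeed, $\dim \mathcal{V}/\mathcal{P} = \dim \mathcal{Q} = \dim \mathcal{V} - \dim \mathcal{P}$. □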
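A concrete picture may help here. The example below is an illustration added to the text; the choice of the space $\mathbb{K}^3$ and of the line $\mathcal{P} = [\mathbf{e}_3]$ is an assumption made for simplicity.

```latex
\[
\mathcal{V} = \mathbb{K}^3, \qquad
\mathcal{P} = [\mathbf{e}_3], \qquad
(x^1, x^2, x^3) + \mathcal{P} = \{(x^1, x^2, t) : t \in \mathbb{K}\},
\]
\[
\mathcal{Q} = [\mathbf{e}_1, \mathbf{e}_2], \qquad
\bigl((x^1, x^2, x^3) + \mathcal{P}\bigr) \cap \mathcal{Q} = \{(x^1, x^2, 0)\}, \qquad
\dim \mathcal{V}/\mathcal{P} = 3 - 1 = 2 .
\]
```

The cosets are the lines parallel to $\mathbf{e}_3$, and each of them meets the complementary plane $\mathcal{Q}$ in exactly one point, in agreement with Proposition 5 and formula (4).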

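Let $\mathcal{V}$ and $\mathcal{W}$ be two vector spaces.

Definition 5. A mapping

$$\varphi: \mathcal{V} \to \mathcal{W}$$

is said to be a linear mapping or homomorphism (or simply
morphism) of vector spaces if it preserves linear operations,
i.e. if

$$\varphi(\mathbf{x} + \mathbf{y}) = \varphi(\mathbf{x}) + \varphi(\mathbf{y})$$

and

$$\varphi(k\mathbf{x}) = k\varphi(\mathbf{x})$$

for any vectors $\mathbf{x}, \mathbf{y} \in \mathcal{V}$ and any number $k \in \mathbb{K}$.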

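Thus the difference between homomorphisms and isomorphisms
is solely in that a homomorphism is not necessarily
a bijective mapping.

Definition 6. The totality of all vectors $\mathbf{x} \in \mathcal{V}$ mapped
under a homomorphism $\varphi$ into the zero of the space $\mathcal{W}$ is called
the kernel of the homomorphism $\varphi$ and designated by the
symbol $\operatorname{Ker} \varphi$. Thus

$$\operatorname{Ker} \varphi = \{\mathbf{x} \in \mathcal{V};\ \varphi(\mathbf{x}) = 0\}.$$

Definition 7. The totality of all vectors of $\mathcal{W}$ having
the form $\varphi(\mathbf{x})$, $\mathbf{x} \in \mathcal{V}$, is called the image of a homomorphism
$\varphi$ and designated by the symbol $\operatorname{Im} \varphi$:

$$\operatorname{Im} \varphi = \{\mathbf{y} \in \mathcal{W};\ \mathbf{y} = \varphi(\mathbf{x}) \text{ for some } \mathbf{x} \in \mathcal{V}\}.$$

Sometimes $\operatorname{Im} \varphi$ is designated by the symbol $\varphi(\mathcal{V})$ and
called the image of the space $\mathcal{V}$ under the homomorphism $\varphi$.

It is obvious that the sets $\operatorname{Ker} \varphi$ and $\operatorname{Im} \varphi$ are subspaces
(of the spaces $\mathcal{V}$ and $\mathcal{W}$ respectively).

The factor space $\mathcal{W}/\operatorname{Im} \varphi$ is designated by the symbol
$\operatorname{Coker} \varphi$ and called the cokernel of a homomorphism $\varphi$.

A homomorphism $\varphi$ is said to be a monomorphism if it is
an injective mapping, i.e. if $\varphi(\mathbf{x}) \ne \varphi(\mathbf{x}_1)$ when $\mathbf{x} \ne \mathbf{x}_1$.

A homomorphism $\varphi$ is said to be an epimorphism if it
maps $\mathcal{V}$ onto $\mathcal{W}$, i.e. if for any vector $\mathbf{y} \in \mathcal{W}$ there is a
vector $\mathbf{x} \in \mathcal{V}$ such that $\mathbf{y} = \varphi(\mathbf{x})$.

Thus a homomorphism $\varphi$ is an isomorphism if and only
if it is simultaneously a monomorphism and an epimorphism.

By definition a homomorphism $\varphi$ is an epimorphism if and
only if $\operatorname{Im} \varphi = \mathcal{W}$, i.e. if $\operatorname{Coker} \varphi = 0$. □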

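Similarly it is easy to see that a homomorphism is a
monomorphism if and only if $\operatorname{Ker} \varphi = 0$. Indeed, if
$\varphi(\mathbf{x}) = \varphi(\mathbf{x}_1)$, then $\varphi(\mathbf{x} - \mathbf{x}_1) = 0$, and therefore
$\mathbf{x} - \mathbf{x}_1 \in \operatorname{Ker} \varphi$. Consequently, if $\operatorname{Ker} \varphi = 0$, then
$\mathbf{x} = \mathbf{x}_1$. Conversely, if it follows from $\varphi(\mathbf{x}) = \varphi(\mathbf{x}_1)$
that $\mathbf{x} = \mathbf{x}_1$, then in particular $\varphi(\mathbf{x}) = 0$ if and only if
$\mathbf{x} = 0$. Consequently $\operatorname{Ker} \varphi = 0$. □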

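If $\operatorname{Ker} \varphi = 0$, then $\varphi$ is obviously an isomorphism of the
space $\mathcal{V}$ onto the subspace $\operatorname{Im} \varphi \subset \mathcal{W}$. Therefore
$\dim \operatorname{Im} \varphi = \dim \mathcal{V}$. It follows that if $\operatorname{Ker} \varphi = 0$ and
$\dim \mathcal{V} = \dim \mathcal{W}$, then the homomorphism $\varphi$ is an isomorphism.
Indeed then $\dim \operatorname{Im} \varphi = \dim \mathcal{W}$ and hence $\operatorname{Im} \varphi = \mathcal{W}$. □

When $\operatorname{Ker} \varphi \ne 0$ it is appropriate to introduce the factor
space

$$\mathcal{V}/\operatorname{Ker} \varphi,$$

which is sometimes called the coimage of a homomorphism $\varphi$.
It is obvious that the formula

$$\varphi'(\mathbf{x} + \operatorname{Ker} \varphi) = \varphi(\mathbf{x}), \quad \mathbf{x} \in \mathcal{V},$$

correctly defines some homomorphism

$$\varphi': \mathcal{V}/\operatorname{Ker} \varphi \to \mathcal{W}$$

called an induced homomorphism and it is not hard to see
that the homomorphism $\varphi'$ is an isomorphism of the factor
space $\mathcal{V}/\operatorname{Ker} \varphi$ onto the subspace $\operatorname{Im} \varphi$.

In particular we see that for any epimorphism $\varphi: \mathcal{V} \to \mathcal{W}$
the space $\mathcal{W}$ is isomorphic to the factor space $\mathcal{V}/\operatorname{Ker} \varphi$. □

Furthermore, since $\dim \mathcal{V}/\operatorname{Ker} \varphi = \dim \mathcal{V} - \dim \operatorname{Ker} \varphi$,
for any homomorphism $\varphi: \mathcal{V} \to \mathcal{W}$ we have the formula

$$\dim \operatorname{Ker} \varphi + \dim \operatorname{Im} \varphi = \dim \mathcal{V}. \tag{5}$$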
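As a quick check of formula (5), consider a homomorphism that is neither a monomorphism nor an epimorphism. The mapping below is an example chosen for illustration and does not come from the lecture.

```latex
\[
\varphi: \mathbb{K}^3 \to \mathbb{K}^2, \qquad
\varphi(x^1, x^2, x^3) = (x^1 + x^2,\; x^1 + x^2),
\]
\[
\operatorname{Ker}\varphi = \{(x^1, x^2, x^3) : x^1 + x^2 = 0\}, \quad \dim \operatorname{Ker}\varphi = 2,
\qquad
\operatorname{Im}\varphi = [(1, 1)], \quad \dim \operatorname{Im}\varphi = 1,
\]
\[
\dim \operatorname{Ker}\varphi + \dim \operatorname{Im}\varphi = 2 + 1 = 3 = \dim \mathbb{K}^3,
\qquad
\dim \operatorname{Coker}\varphi = 2 - 1 = 1 .
\]
```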

All these statements, except for formula (5), are of a
very general character and are correct for any groups and
rings, as we know from the first semester course in algebra.

We now return to direct sums. 

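Let $\mathcal{P}$ and $\mathcal{Q}$ be arbitrary vector spaces (over the same
field $\mathbb{K}$). Consider the set $\mathcal{V}$ of all pairs of the form $(\mathbf{x}, \mathbf{y})$,
where $\mathbf{x} \in \mathcal{P}$, $\mathbf{y} \in \mathcal{Q}$. Setting

$$(\mathbf{x}, \mathbf{y}) + (\mathbf{x}_1, \mathbf{y}_1) = (\mathbf{x} + \mathbf{x}_1, \mathbf{y} + \mathbf{y}_1)$$

and

$$k(\mathbf{x}, \mathbf{y}) = (k\mathbf{x}, k\mathbf{y}),$$

we obviously make $\mathcal{V}$ into a vector space.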












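Definition 8. The constructed space $\mathcal{V}$ is called a direct
sum of the spaces $\mathcal{P}$ and $\mathcal{Q}$ (or sometimes an external direct
sum, to distinguish it from the "internal" direct sum considered
above, when the space $\mathcal{V}$ was preassigned and $\mathcal{P}$ and $\mathcal{Q}$
were its subspaces).

We are justified in using this terminology because the
vectors of $\mathcal{V}$ having the form $(\mathbf{x}, 0)$, $\mathbf{x} \in \mathcal{P}$, constitute a
subspace $\tilde{\mathcal{P}}$ isomorphic to the space $\mathcal{P}$, and those having the form
$(0, \mathbf{y})$, $\mathbf{y} \in \mathcal{Q}$, constitute a subspace $\tilde{\mathcal{Q}}$ isomorphic to the
space $\mathcal{Q}$. Besides, the subspaces $\tilde{\mathcal{P}}$ and $\tilde{\mathcal{Q}}$ are disjoint
(have only the zero vector $(0, 0)$ in common) and sum to
the whole of $\mathcal{V}$ (for $(\mathbf{x}, \mathbf{y}) = (\mathbf{x}, 0) + (0, \mathbf{y})$). Thus
$\mathcal{V} = \tilde{\mathcal{P}} \oplus \tilde{\mathcal{Q}}$.

$\tilde{\mathcal{P}}$ is usually identified with $\mathcal{P}$ and $\tilde{\mathcal{Q}}$ with $\mathcal{Q}$ and we
write $\mathcal{V} = \mathcal{P} \oplus \mathcal{Q}$ or $\mathcal{V} = \mathcal{P} \dotplus \mathcal{Q}$. This causes no
ambiguity.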

The construction of the external direct sum was also
encountered in the first semester course in algebra in
connection with the case of groups. Actually, it is this
construction that we used in the first semester when we constructed
complexifications.

In our next lecture we consider constructions that are 
more specific for the theory of vector spaces. 





Lecture 4 


The conjugate space • Dual spaces • A second conjugate
space • The transformation of a conjugate basis and of the
coordinates of covectors • Annulets • The space of solutions
of a system of homogeneous linear equations


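Let $\mathcal{V}$ be an arbitrary vector space over a field $\mathbb{K}$.

Definition 1. A function $\boldsymbol{\xi}: \mathcal{V} \to \mathbb{K}$ is said to be a linear
functional if it is a homomorphism of vector spaces, i.e.

$$\boldsymbol{\xi}(\mathbf{x} + \mathbf{y}) = \boldsymbol{\xi}(\mathbf{x}) + \boldsymbol{\xi}(\mathbf{y})$$

and

$$\boldsymbol{\xi}(k\mathbf{x}) = k\boldsymbol{\xi}(\mathbf{x})$$

for any vectors $\mathbf{x}, \mathbf{y} \in \mathcal{V}$ and any number $k \in \mathbb{K}$. Linear
functionals are also called the covectors (covariant vectors)
of the space $\mathcal{V}$.

A direct check shows that a sum $\boldsymbol{\xi} + \boldsymbol{\eta}$ of two linear
functionals $\boldsymbol{\xi}$ and $\boldsymbol{\eta}$ (defined by the formula
$(\boldsymbol{\xi} + \boldsymbol{\eta})(\mathbf{x}) = \boldsymbol{\xi}(\mathbf{x}) + \boldsymbol{\eta}(\mathbf{x})$) and a product $k\boldsymbol{\xi}$ of a
linear functional $\boldsymbol{\xi}$ by an arbitrary number $k$ (defined by the
formula $(k\boldsymbol{\xi})(\mathbf{x}) = k\boldsymbol{\xi}(\mathbf{x})$) are linear functionals. This
means that the set of all linear functionals is a subspace of the
space of all functions in $\mathcal{V}$ and hence is itself a vector space. This
vector space is designated by the symbol $\mathcal{V}'$.

Definition 2. The vector space $\mathcal{V}'$ is called the space conjugate
to the space $\mathcal{V}$.

Let $\mathbf{e}_1, \dots, \mathbf{e}_n$ be an arbitrary basis of the space $\mathcal{V}$.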



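Proposition 1. The value $\boldsymbol{\xi}(\mathbf{x})$ of an arbitrary linear
functional $\boldsymbol{\xi}$ on a vector $\mathbf{x} = x^1\mathbf{e}_1 + \dots + x^n\mathbf{e}_n$ is expressed
by the formula

$$\boldsymbol{\xi}(\mathbf{x}) = \xi_1 x^1 + \dots + \xi_n x^n, \tag{1}$$

where

$$\xi_1 = \boldsymbol{\xi}(\mathbf{e}_1), \quad \dots, \quad \xi_n = \boldsymbol{\xi}(\mathbf{e}_n). \tag{2}$$

For any numbers $\xi_1, \dots, \xi_n \in \mathbb{K}$ formula (1) uniquely
gives some linear functional $\boldsymbol{\xi} \in \mathcal{V}'$ for which we have (2).

Proof. Formula (1) directly follows from the property of
linearity:

$$\boldsymbol{\xi}(\mathbf{x}) = \boldsymbol{\xi}(x^1\mathbf{e}_1 + \dots + x^n\mathbf{e}_n) = x^1\boldsymbol{\xi}(\mathbf{e}_1) + \dots + x^n\boldsymbol{\xi}(\mathbf{e}_n) = \xi_1 x^1 + \dots + \xi_n x^n.$$

Conversely, if the functional $\boldsymbol{\xi}$ is given by formula (1), then

$$\boldsymbol{\xi}(\mathbf{x} + \mathbf{y}) = \xi_1(x^1 + y^1) + \dots + \xi_n(x^n + y^n) = \xi_1 x^1 + \dots + \xi_n x^n + \xi_1 y^1 + \dots + \xi_n y^n = \boldsymbol{\xi}(\mathbf{x}) + \boldsymbol{\xi}(\mathbf{y})$$

and

$$\boldsymbol{\xi}(k\mathbf{x}) = \xi_1(kx^1) + \dots + \xi_n(kx^n) = k(\xi_1 x^1 + \dots + \xi_n x^n) = k\boldsymbol{\xi}(\mathbf{x})$$

for any vectors $\mathbf{x}, \mathbf{y} \in \mathcal{V}$ and any number $k \in \mathbb{K}$. Besides,
$\boldsymbol{\xi}(\mathbf{e}_i) = \xi_1 \cdot 0 + \dots + \xi_i \cdot 1 + \dots + \xi_n \cdot 0 = \xi_i$. □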

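It follows from Proposition 1 that the formula

$$\mathbf{e}^i(\mathbf{e}_j) = \begin{cases} 0 & \text{if } i \ne j, \\ 1 & \text{if } i = j, \end{cases} \qquad i, j = 1, \dots, n,$$

uniquely determines $n$ linear functionals

$$\mathbf{e}^1, \dots, \mathbf{e}^n. \tag{3}$$

It is clear that for any vector $\mathbf{x} \in \mathcal{V}$

$$\mathbf{e}^i(\mathbf{x}) = x^i, \quad i = 1, \dots, n.$$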





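Proposition 2. Functionals (3) form a basis of the space $\mathcal{V}'$.
The coordinates of an arbitrary functional $\boldsymbol{\xi}$ in this basis are
the coefficients (2) of its representation (1):

$$\boldsymbol{\xi} = \xi_1\mathbf{e}^1 + \dots + \xi_n\mathbf{e}^n. \tag{4}$$

Proof. For any vector $\mathbf{x} = x^1\mathbf{e}_1 + \dots + x^n\mathbf{e}_n$ and any
numbers $\xi_1, \dots, \xi_n \in \mathbb{K}$ we have

$$(\xi_1\mathbf{e}^1 + \dots + \xi_n\mathbf{e}^n)(\mathbf{x}) = \xi_1\mathbf{e}^1(\mathbf{x}) + \dots + \xi_n\mathbf{e}^n(\mathbf{x}) = \xi_1 x^1 + \dots + \xi_n x^n.$$

Consequently, if $\xi_1, \dots, \xi_n$ are the coefficients (2) of the
functional $\boldsymbol{\xi}$, then $(\xi_1\mathbf{e}^1 + \dots + \xi_n\mathbf{e}^n)(\mathbf{x}) = \boldsymbol{\xi}(\mathbf{x})$ for any
vector $\mathbf{x} \in \mathcal{V}$. This proves formula (4) and the completeness
of the family $\mathbf{e}^1, \dots, \mathbf{e}^n$ in $\mathcal{V}'$.

On the other hand, if

$$\xi_1\mathbf{e}^1 + \dots + \xi_n\mathbf{e}^n = 0,$$

then for any $i = 1, \dots, n$

$$\xi_i = (\xi_1\mathbf{e}^1 + \dots + \xi_n\mathbf{e}^n)(\mathbf{e}_i) = 0.$$

Consequently, the family $\mathbf{e}^1, \dots, \mathbf{e}^n$ is linearly independent
and hence is a basis. □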
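In a coordinate space the conjugate basis can be computed explicitly. The sketch below rests on assumptions of our own: the space is $\mathbb{K}^n$ with rational coordinates, a covector is stored as its row of coefficients, and the helper names `inverse`, `conjugate_basis` and `apply` are hypothetical. If $E$ is the matrix whose columns are $\mathbf{e}_1, \dots, \mathbf{e}_n$, the rows of $E^{-1}$ satisfy the defining relation of the functionals (3).

```python
from fractions import Fraction

def inverse(M):
    """Inverse of a square matrix by Gauss-Jordan elimination over the rationals."""
    n = len(M)
    aug = [[Fraction(v) for v in row] + [Fraction(int(i == j)) for j in range(n)]
           for i, row in enumerate(M)]
    for col in range(n):
        # raises StopIteration if the matrix is singular (the vectors are not a basis)
        p = next(i for i in range(col, n) if aug[i][col] != 0)
        aug[col], aug[p] = aug[p], aug[col]
        pivot = aug[col][col]
        aug[col] = [v / pivot for v in aug[col]]
        for i in range(n):
            if i != col:
                f = aug[i][col]
                aug[i] = [a - f * b for a, b in zip(aug[i], aug[col])]
    return [row[n:] for row in aug]

def conjugate_basis(basis):
    """Coefficient rows of the covectors e^1, ..., e^n conjugate to the given basis."""
    E = [list(row) for row in zip(*basis)]   # matrix whose j-th column is e_j
    return inverse(E)                        # its i-th row represents e^i

def apply(covector, vector):
    """Value of a covector (row of coefficients) on a vector."""
    return sum(c * v for c, v in zip(covector, vector))

basis = [(1, 0), (1, 1)]                     # e_1, e_2 in K^2
dual = conjugate_basis(basis)                # e^1, e^2
print([[apply(f, e) for e in basis] for f in dual])   # values e^i(e_j)

x = (5, 2)                                   # x = 3 e_1 + 2 e_2
print([apply(f, x) for f in dual])           # coordinates of x in the basis e_1, e_2
```

The second print illustrates the relation $\mathbf{e}^i(\mathbf{x}) = x^i$: applying the conjugate covectors to a vector returns its coordinates in the chosen basis.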

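Corollary.

$$\dim \mathcal{V}' = \dim \mathcal{V}.$$

The basis $\mathbf{e}^1, \dots, \mathbf{e}^n$ is said to be conjugate to the basis
$\mathbf{e}_1, \dots, \mathbf{e}_n$.

In the Einstein notation formula (1) has the form

$$\boldsymbol{\xi}(\mathbf{x}) = \xi_i x^i$$

and formula (4) has the form

$$\boldsymbol{\xi} = \xi_i\mathbf{e}^i.$$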
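Let $\mathcal{V}$ and $\mathcal{W}$ be two vector spaces over a field $\mathbb{K}$. Suppose
that any two vectors $\mathbf{x} \in \mathcal{V}$, $\mathbf{y} \in \mathcal{W}$ are assigned a number
$\langle \mathbf{x}, \mathbf{y} \rangle \in \mathbb{K}$ such that the following conditions hold:

(i) for any fixed $\mathbf{y} \in \mathcal{W}$ the function $\mathbf{x} \mapsto \langle \mathbf{x}, \mathbf{y} \rangle$ is a linear
functional in $\mathcal{V}$, i.e.

$$\langle \mathbf{x}_1 + \mathbf{x}_2, \mathbf{y} \rangle = \langle \mathbf{x}_1, \mathbf{y} \rangle + \langle \mathbf{x}_2, \mathbf{y} \rangle, \qquad \langle k\mathbf{x}, \mathbf{y} \rangle = k\langle \mathbf{x}, \mathbf{y} \rangle$$

for any vectors $\mathbf{x}_1, \mathbf{x}_2, \mathbf{x} \in \mathcal{V}$ and any number $k \in \mathbb{K}$;

(ii) for any fixed $\mathbf{x} \in \mathcal{V}$ the function $\mathbf{y} \mapsto \langle \mathbf{x}, \mathbf{y} \rangle$ is a
linear functional in $\mathcal{W}$, i.e.

$$\langle \mathbf{x}, \mathbf{y}_1 + \mathbf{y}_2 \rangle = \langle \mathbf{x}, \mathbf{y}_1 \rangle + \langle \mathbf{x}, \mathbf{y}_2 \rangle, \qquad \langle \mathbf{x}, k\mathbf{y} \rangle = k\langle \mathbf{x}, \mathbf{y} \rangle$$

for any vectors $\mathbf{y}_1, \mathbf{y}_2, \mathbf{y} \in \mathcal{W}$ and any number $k \in \mathbb{K}$;

(iii) for any vector $\mathbf{x} \ne 0$ of $\mathcal{V}$ there is a vector $\mathbf{y} \in \mathcal{W}$ such
that $\langle \mathbf{x}, \mathbf{y} \rangle \ne 0$ and conversely for any vector $\mathbf{y} \ne 0$ of $\mathcal{W}$ there
is a vector $\mathbf{x} \in \mathcal{V}$ such that $\langle \mathbf{x}, \mathbf{y} \rangle \ne 0$.

Conditions (i) and (ii) are called the bilinearity conditions
and condition (iii) is called the nonsingularity condition.

Definition 3. The function $\mathbf{x}, \mathbf{y} \mapsto \langle \mathbf{x}, \mathbf{y} \rangle$ satisfying
conditions (i), (ii), (iii) is called a pairing between the spaces $\mathcal{V}$
and $\mathcal{W}$. The spaces $\mathcal{V}$ and $\mathcal{W}$ for which there exists at least
one pairing are said to be dual. The notation is $\mathcal{V} \mid \mathcal{W}$.

Note that the dual relation is obviously symmetrical,
i.e. if $\mathcal{V} \mid \mathcal{W}$, then $\mathcal{W} \mid \mathcal{V}$.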

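Proposition 3. Any vector space $\mathcal{V}$ is dual to the conjugate
space $\mathcal{V}'$, i.e.

$$\mathcal{V} \mid \mathcal{V}'.$$

Proof. For any $\mathbf{x} \in \mathcal{V}$ and $\boldsymbol{\xi} \in \mathcal{V}'$ set

$$\langle \mathbf{x}, \boldsymbol{\xi} \rangle = \boldsymbol{\xi}(\mathbf{x}).$$

It is obvious that the bilinearity conditions (i) and (ii)
hold (for example,
$\langle \mathbf{x}, \boldsymbol{\xi}_1 + \boldsymbol{\xi}_2 \rangle = (\boldsymbol{\xi}_1 + \boldsymbol{\xi}_2)(\mathbf{x}) = \boldsymbol{\xi}_1(\mathbf{x}) + \boldsymbol{\xi}_2(\mathbf{x}) = \langle \mathbf{x}, \boldsymbol{\xi}_1 \rangle + \langle \mathbf{x}, \boldsymbol{\xi}_2 \rangle$).
The inequality $\boldsymbol{\xi} \ne 0$ implies that there exists a vector
$\mathbf{x} \in \mathcal{V}$ such that $\boldsymbol{\xi}(\mathbf{x}) \ne 0$. Consequently
$\langle \mathbf{x}, \boldsymbol{\xi} \rangle \ne 0$. Similarly the inequality $\mathbf{x} \ne 0$ implies
that $x^{i_0} \ne 0$ for at least one $i_0$ and so for $\boldsymbol{\xi} = \mathbf{e}^{i_0}$ we
have $\langle \mathbf{x}, \boldsymbol{\xi} \rangle = \boldsymbol{\xi}(\mathbf{x}) = x^{i_0} \ne 0$. Thus condition (iii) also
holds. □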





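The converse is true if stated as follows:

Proposition 4. If spaces $\mathcal{V}$ and $\mathcal{W}$ are dual, then each
of them is isomorphic to the space conjugate to the other:

$$\mathcal{V} \cong \mathcal{W}', \qquad \mathcal{W} \cong \mathcal{V}'.$$

Proof. By symmetry of the dual relation it is enough to
prove only the first of these isomorphisms. Let $\mathbf{x} \in \mathcal{V}$.
According to condition (ii) the function $\mathbf{y} \mapsto \langle \mathbf{x}, \mathbf{y} \rangle$ is a linear
functional in $\mathcal{W}$, i.e. a vector of the space $\mathcal{W}'$. Denoting this
linear functional by $\varphi(\mathbf{x})$ we therefore obtain a certain
mapping

$$\varphi: \mathcal{V} \to \mathcal{W}'.$$

Thus by definition

$$\varphi(\mathbf{x})(\mathbf{y}) = \langle \mathbf{x}, \mathbf{y} \rangle.$$

Therefore by condition (i)

$$\varphi(\mathbf{x}_1 + \mathbf{x}_2)(\mathbf{y}) = \langle \mathbf{x}_1 + \mathbf{x}_2, \mathbf{y} \rangle = \langle \mathbf{x}_1, \mathbf{y} \rangle + \langle \mathbf{x}_2, \mathbf{y} \rangle = \varphi(\mathbf{x}_1)(\mathbf{y}) + \varphi(\mathbf{x}_2)(\mathbf{y}),$$

i.e.

$$\varphi(\mathbf{x}_1 + \mathbf{x}_2) = \varphi(\mathbf{x}_1) + \varphi(\mathbf{x}_2).$$

Similarly

$$\varphi(k\mathbf{x})(\mathbf{y}) = \langle k\mathbf{x}, \mathbf{y} \rangle = k\langle \mathbf{x}, \mathbf{y} \rangle = k\varphi(\mathbf{x})(\mathbf{y}),$$

i.e.

$$\varphi(k\mathbf{x}) = k\varphi(\mathbf{x}).$$

This proves that the mapping $\varphi$ is a homomorphism.

If $\varphi(\mathbf{x}) = 0$, then $\langle \mathbf{x}, \mathbf{y} \rangle = 0$ for all $\mathbf{y} \in \mathcal{W}$ and hence
(condition (iii)) $\mathbf{x} = 0$. Thus $\operatorname{Ker} \varphi = 0$. Therefore
$\operatorname{Im} \varphi \cong \mathcal{V}$ and hence
$\dim \mathcal{V} = \dim \operatorname{Im} \varphi \le \dim \mathcal{W}' = \dim \mathcal{W}$.

But by symmetry of the dual relation, if the inequality
$\dim \mathcal{V} \le \dim \mathcal{W}$ holds so must the inequality
$\dim \mathcal{W} \le \dim \mathcal{V}$. Consequently $\dim \mathcal{V} = \dim \mathcal{W}$ and therefore
in particular $\dim \operatorname{Im} \varphi = \dim \mathcal{W}'$, i.e. $\operatorname{Im} \varphi = \mathcal{W}'$. This
proves that the homomorphism $\varphi$ is an isomorphism. □

Since $\mathcal{V} \mid \mathcal{V}'$ (Proposition 3) we have $\mathcal{V}' \mid \mathcal{V}$ (symmetry),
and hence $\mathcal{V} \cong (\mathcal{V}')'$ (Proposition 4). This result is so
important that it deserves to be ranked as a theorem.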








Theorem 1. The space $(\mathcal{V}')'$ conjugate to the conjugate space is
isomorphic to the original space:

$$(\mathcal{V}')' \cong \mathcal{V}. \quad □$$

In explicit form the isomorphism $\mathcal{V} \to (\mathcal{V}')'$ is given by
the correspondence associating with a vector $\mathbf{x} \in \mathcal{V}$ a
functional $\tilde{\mathbf{x}}$ in $\mathcal{V}'$ defined by the formula

$$\tilde{\mathbf{x}}(\boldsymbol{\xi}) = \boldsymbol{\xi}(\mathbf{x}).$$

As a rule the functional $\tilde{\mathbf{x}}$ is identified with the vector $\mathbf{x}$
and therefore denoted in particular simply by $\mathbf{x}$.

On the face of it Theorem 1 appears to be a trivial consequence
of the fact that the spaces $\mathcal{V}$ and $(\mathcal{V}')'$ are equidimensional.
But in fact it means that there is a "natural" isomorphism
$\mathcal{V} \to (\mathcal{V}')'$ between the spaces $\mathcal{V}$ and $(\mathcal{V}')'$ that
can be constructed without any arbitrariness. It is this fact
that allows us to identify $\tilde{\mathbf{x}}$ and $\mathbf{x}$ (and hence $(\mathcal{V}')'$ and $\mathcal{V}$).

The spaces $\mathcal{V}$ and $\mathcal{V}'$ too have the same dimension, but we
cannot establish any natural isomorphism between them in
the general case. At present we lack the necessary concepts
for proving this statement (for example, we lack an accurate
definition of what a "natural" isomorphism is) and therefore
we are forced to restrict ourselves to the proof that even the
simplest, one would think, and most natural attempt to
construct such an isomorphism fails.

Let $\mathbf{e}_1, \dots, \mathbf{e}_n$ be an arbitrary basis of the space $\mathcal{V}$ and
$\mathbf{e}^1, \dots, \mathbf{e}^n$ the conjugate basis of the space $\mathcal{V}'$. We may try
to consider an isomorphism $\mathcal{V} \to \mathcal{V}'$ acting by equating
the coordinates in these two bases (this isomorphism associates
with every vector $\mathbf{x} = x^1\mathbf{e}_1 + \dots + x^n\mathbf{e}_n$ a covector
$\boldsymbol{\xi} = x^1\mathbf{e}^1 + \dots + x^n\mathbf{e}^n$ having in the basis $\mathbf{e}^1, \dots, \mathbf{e}^n$ the
same coordinates that the vector $\mathbf{x}$ has in the basis $\mathbf{e}_1, \dots, \mathbf{e}_n$)
hoping to find it independent of the basis $\mathbf{e}_1, \dots, \mathbf{e}_n$ (and
therefore "natural"). But this hope is not realized.

To show this it is necessary to consider in a general form
the transformation of the coordinates of covectors when
changing the basis $\mathbf{e}_1, \dots, \mathbf{e}_n$.







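We make the computations involved using the Einstein
notation. To do this it is appropriate to introduce the
so-called Kronecker delta-symbol $\delta^i_j$ defined by the formula

$$\delta^i_j = \begin{cases} 0 & \text{if } i \ne j, \\ 1 & \text{if } i = j. \end{cases}$$

The main property of the symbol is expressed by the formulas

$$a^j\delta^i_j = a^i, \qquad b_j\delta^j_i = b_i$$

(indeed, the terms of the left-hand sums, except, respectively,
the terms $a^i \cdot 1 = a^i$ and $b_i \cdot 1 = b_i$, are all zero).

With the aid of the Kronecker delta-symbol the defining
property of the conjugate basis can be written as a single
formula:

$$\mathbf{e}^i(\mathbf{e}_j) = \delta^i_j.$$

Similarly the fact that matrices $(c^i_{i'})$ and $(c^{i'}_i)$ are reciprocal
can be written down in the following two equivalent
forms:

$$c^{i'}_i c^i_{j'} = \delta^{i'}_{j'}, \qquad c^i_{i'} c^{i'}_j = \delta^i_j.$$

With all this in mind consider, along with the basis
$\mathbf{e}_1, \dots, \mathbf{e}_n$, another basis $\mathbf{e}_{1'}, \dots, \mathbf{e}_{n'}$ for which

$$\mathbf{e}_{i'} = c^i_{i'}\mathbf{e}_i, \qquad \mathbf{e}_i = c^{i'}_i\mathbf{e}_{i'},$$

where $C = (c^i_{i'})$ is a transition matrix and $C^{-1} = (c^{i'}_i)$ is the
inverse matrix. Then, as we know (see Lecture 11 in [1]),
for the coordinates $x^i$ and $x^{i'}$ we have the formulas

$$x^{i'} = c^{i'}_i x^i, \qquad x^i = c^i_{i'} x^{i'}.$$

Now let $\mathbf{e}^{1'}, \dots, \mathbf{e}^{n'}$ be the basis conjugate to the basis
$\mathbf{e}_{1'}, \dots, \mathbf{e}_{n'}$. Then by definition

$$\mathbf{e}^{i'}(\mathbf{e}_{j'}) = \delta^{i'}_{j'}.$$

Consequently

$$\mathbf{e}^{i'}(\mathbf{e}_i) = \mathbf{e}^{i'}(c^{j'}_i\mathbf{e}_{j'}) = c^{j'}_i\delta^{i'}_{j'} = c^{i'}_i.$$

But according to Proposition 2

$$\boldsymbol{\xi} = \boldsymbol{\xi}(\mathbf{e}_i)\mathbf{e}^i$$

for any covector $\boldsymbol{\xi} \in \mathcal{V}'$. Therefore in particular

$$\mathbf{e}^{i'} = c^{i'}_i\mathbf{e}^i$$

and hence

$$\mathbf{e}^i = c^i_{i'}\mathbf{e}^{i'}$$

(the last formula can be written by symmetry or obtained
by computation: $c^i_{i'}\mathbf{e}^{i'} = c^i_{i'}c^{i'}_j\mathbf{e}^j = \delta^i_j\mathbf{e}^j = \mathbf{e}^i$). Similarly
for the coordinates $\xi_i = \boldsymbol{\xi}(\mathbf{e}_i)$ and $\xi_{i'} = \boldsymbol{\xi}(\mathbf{e}_{i'})$ of an
arbitrary covector $\boldsymbol{\xi}$ we have

$$\xi_{i'} = \boldsymbol{\xi}(\mathbf{e}_{i'}) = \boldsymbol{\xi}(c^i_{i'}\mathbf{e}_i) = c^i_{i'}\boldsymbol{\xi}(\mathbf{e}_i) = c^i_{i'}\xi_i,$$

i.e.

$$\xi_{i'} = c^i_{i'}\xi_i$$

and, by symmetry (or by the same computation),

$$\xi_i = c^{i'}_i\xi_{i'}.$$

We see that the covectors of a conjugate basis are transformed
as the coordinates of vectors and correspondingly the coordinates
of covectors are transformed as the vectors of a basis.

It is customary to call the transformation of a basis
cogredient and the transformation of the coordinates of
vectors (i.e. the transformation with an inverse and a
transposed matrix) contragredient. Thus conjugate bases are
transformed contragrediently and the coordinates of covectors are
transformed cogrediently. □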

Therefore, if in some single basis (and in that conjugate
to it) a vector $\mathbf{x}$ and a covector $\boldsymbol{\xi}$ had the same coordinates,
then in another basis, because the coordinates of vectors
and covectors are transformed by different formulas, the
vector $\mathbf{x}$ and covector $\boldsymbol{\xi}$ will have different coordinates.
Consequently, mapping by equating the coordinates in
conjugate bases is basis-dependent and so is not natural.
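The simplest case already shows the effect. The following one-dimensional example is an illustration added here; the choice of the new basis $\mathbf{e}_{1'} = 2\mathbf{e}_1$ is an assumption made for convenience.

```latex
\[
\mathbf{e}_{1'} = 2\mathbf{e}_1
\quad\Longrightarrow\quad
c^{1}_{1'} = 2, \quad c^{1'}_{1} = \tfrac12, \quad
\mathbf{e}^{1'} = c^{1'}_{1}\mathbf{e}^{1} = \tfrac12\,\mathbf{e}^{1} .
\]
\[
\mathbf{x} = \mathbf{e}_1 = \tfrac12\,\mathbf{e}_{1'} :
\qquad
\text{in the basis } \mathbf{e}_1:\ \mathbf{x} \mapsto 1 \cdot \mathbf{e}^{1} = \mathbf{e}^{1},
\qquad
\text{in the basis } \mathbf{e}_{1'}:\ \mathbf{x} \mapsto \tfrac12\,\mathbf{e}^{1'} = \tfrac14\,\mathbf{e}^{1} .
\]
```

The same vector is sent to two different covectors, depending on the basis used to equate coordinates.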





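Let $S \subset \mathcal{V}$ be an arbitrary subset of a vector space $\mathcal{V}$.

Definition 4. The totality of all linear functionals $\boldsymbol{\xi} \in \mathcal{V}'$
equal to zero on any vector $\mathbf{x} \in S$ is called an annulet of the
set $S$ and designated by the symbol $\operatorname{Ann} S$ or $S^\circ$.

Thus

$$\operatorname{Ann} S = \{\boldsymbol{\xi} \in \mathcal{V}';\ \boldsymbol{\xi}(\mathbf{x}) = 0 \text{ for any } \mathbf{x} \in S\}.$$

It is obvious that $S^\circ$ is a subspace of the space $\mathcal{V}'$. And
if $S \subset T$, then $S^\circ \supset T^\circ$. □

Proposition 5. The annulet of an arbitrary set $S \subset \mathcal{V}$
coincides with that of its linear span:

$$\operatorname{Ann} S = \operatorname{Ann} [S].$$

Proof. Since $S \subset [S]$, then $S^\circ \supset [S]^\circ$. Conversely, let
$\boldsymbol{\xi} \in S^\circ$. Then for any vector $k_1\mathbf{x}_1 + \dots + k_m\mathbf{x}_m$ of $[S]$, where
$\mathbf{x}_1, \dots, \mathbf{x}_m \in S$, we have the equation

$$\boldsymbol{\xi}(k_1\mathbf{x}_1 + \dots + k_m\mathbf{x}_m) = k_1\boldsymbol{\xi}(\mathbf{x}_1) + \dots + k_m\boldsymbol{\xi}(\mathbf{x}_m) = 0,$$

since $\boldsymbol{\xi}(\mathbf{x}_1) = 0, \dots, \boldsymbol{\xi}(\mathbf{x}_m) = 0$. Consequently,
$\boldsymbol{\xi} \in [S]^\circ$, i.e. $S^\circ \subset [S]^\circ$. □

According to this proposition consideration of annulets
may be restricted to subspaces.

It is clear that $\operatorname{Ann} 0 = \mathcal{V}'$ and conversely if $\operatorname{Ann} S = \mathcal{V}'$
then $S = \{0\}$ (for if $\boldsymbol{\xi}(\mathbf{x}) = 0$ for all $\boldsymbol{\xi} \in \mathcal{V}'$, then $\mathbf{x} = 0$). □

Similarly $\operatorname{Ann} \mathcal{V} = 0$ and if $\operatorname{Ann} S = 0$, then $[S] = \mathcal{V}$.
Indeed, if $[S] \ne \mathcal{V}$ and if $\mathbf{e}_1, \dots, \mathbf{e}_n$ is a basis of the space
$\mathcal{V}$ such that $[S] = [\mathbf{e}_1, \dots, \mathbf{e}_m]$, $m < n$, then
$\mathbf{e}^n \in [S]^\circ$ and therefore $S^\circ \ne 0$. □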

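Proposition 6. For any subspace $\mathcal{P} \subset \mathcal{V}$ we have the
equation

$$\dim \mathcal{P}^\circ = n - \dim \mathcal{P}.$$

Proof. Let $\dim \mathcal{P} = p$ and let $\mathbf{e}_1, \dots, \mathbf{e}_p, \dots, \mathbf{e}_n$ be a
basis of the space $\mathcal{V}$ such that $\mathcal{P} = [\mathbf{e}_1, \dots, \mathbf{e}_p]$. Consider
the conjugate basis

$$\mathbf{e}^1, \dots, \mathbf{e}^p, \dots, \mathbf{e}^n.$$

If $i \le p$ and $j > p$, then it is automatic that $i \ne j$ and so
$\mathbf{e}^j(\mathbf{e}_i) = 0$. Therefore
$\mathbf{e}^{p+1}, \dots, \mathbf{e}^n \in [\mathbf{e}_1, \dots, \mathbf{e}_p]^\circ = \mathcal{P}^\circ$.
On the other hand, if $\boldsymbol{\xi} \in \mathcal{P}^\circ$, then
$\boldsymbol{\xi}(\mathbf{e}_1) = 0, \dots, \boldsymbol{\xi}(\mathbf{e}_p) = 0$
and hence $\boldsymbol{\xi} = \xi_{p+1}\mathbf{e}^{p+1} + \dots + \xi_n\mathbf{e}^n$.
This proves that the covectors $\mathbf{e}^{p+1}, \dots, \mathbf{e}^n$ form a basis
of the subspace $\mathcal{P}^\circ$. Therefore $\dim \mathcal{P}^\circ = n - p$. □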







Since (Theorem 1) $\mathcal{V} = (\mathcal{V}')'$, in all that was said above
$\mathcal{V}$ can be replaced by $\mathcal{V}'$ and $\mathcal{V}'$ by $\mathcal{V}$. In particular, this
will determine for any set $S \subset \mathcal{V}'$ a subspace $\operatorname{Ann} S \subset \mathcal{V}$
consisting of vectors $\mathbf{x} \in \mathcal{V}$ such that $\mathbf{x}(\boldsymbol{\xi}) = 0$ (i.e.
$\boldsymbol{\xi}(\mathbf{x}) = 0$) for any covector $\boldsymbol{\xi} \in S$, and the dimension of that
subspace will be equal to $n - r$, where $r$ is the dimension of
the subspace $[S]$, i.e. the rank of the set $S$.

Thus, firstly, subspaces of the space $\mathcal{V}$ can be given not
only as linear spans, but also, "dually", as annulets of sets
of covectors $S = \{\boldsymbol{\xi}_1, \dots, \boldsymbol{\xi}_m\}$, i.e. by equations of the
form

$$\boldsymbol{\xi}_1(\mathbf{x}) = 0, \quad \dots, \quad \boldsymbol{\xi}_m(\mathbf{x}) = 0. \tag{5}$$

Secondly, we have an effective way of computing the
dimension of a subspace given in this way: it is equal to
$n - r$, where $r$ is the rank of the set $\{\boldsymbol{\xi}_1, \dots, \boldsymbol{\xi}_m\}$.

It is appropriate to restate all this in terms of coordinates.
Covectors $\boldsymbol{\xi}_1, \dots, \boldsymbol{\xi}_m$ are written in coordinates
(Proposition 1) as linear forms in $x^1, \dots, x^n$. Therefore equations
(5) take in coordinates the form

$$\begin{aligned} a_{11}x^1 + \dots + a_{1n}x^n &= 0, \\ &\;\;\vdots \\ a_{m1}x^1 + \dots + a_{mn}x^n &= 0, \end{aligned} \tag{6}$$

i.e. are ordinary homogeneous linear equations. We thus
obtain the following theorem:

Theorem 2. The set of all solutions $(x^1, \dots, x^n)$ of system
(6) of homogeneous linear equations is a subspace of the space
$\mathbb{K}^n$ of dimension $n - r$, where $r$ is the rank of the matrix of the
coefficients

$$\begin{pmatrix} a_{11} & \dots & a_{1n} \\ \vdots & & \vdots \\ a_{m1} & \dots & a_{mn} \end{pmatrix}.$$

To find a basis of that subspace, i.e. $n - r$ linearly
independent solutions (which are usually called a fundamental
system of solutions), it is necessary in solving system (6)
by the method described in Lecture 2 to assign to the $n - r$
"free" unknowns $n - r$ sets of values seeing to it that linearly
independent solutions result. To do this it is enough to choose
the indicated sets in such a way that on being arranged as
a square matrix of order $n - r$ they should form a nonsingular
matrix (it is easiest to choose them in such a way
that a unit matrix should result).
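The recipe for a fundamental system of solutions can be sketched in code. The sketch below makes assumptions of its own: it works over the rationals, it finds the free unknowns by Gauss-Jordan elimination instead of minors, and the function name `fundamental_system` is hypothetical. The free unknowns are given the columns of a unit matrix, as suggested above.

```python
from fractions import Fraction

def fundamental_system(A):
    """Basis of the solution space of the homogeneous system with coefficient rows A."""
    rows = [[Fraction(v) for v in row] for row in A]
    n = len(A[0])
    pivots, r = [], 0
    for col in range(n):
        p = next((i for i in range(r, len(rows)) if rows[i][col] != 0), None)
        if p is None:
            continue                          # no pivot: this unknown is free
        rows[r], rows[p] = rows[p], rows[r]
        pivot = rows[r][col]
        rows[r] = [v / pivot for v in rows[r]]
        for i in range(len(rows)):
            if i != r:
                f = rows[i][col]
                rows[i] = [a - f * b for a, b in zip(rows[i], rows[r])]
        pivots.append(col)
        r += 1
    free = [j for j in range(n) if j not in pivots]
    solutions = []
    for f in free:
        # one free unknown equals 1, the other free unknowns equal 0
        x = [Fraction(0)] * n
        x[f] = Fraction(1)
        for row_index, col in enumerate(pivots):
            x[col] = -rows[row_index][f]
        solutions.append(x)
    return solutions

A = [[1, 2, 3, 4], [2, 4, 6, 8], [1, 1, 1, 1]]   # rank 2, n = 4 unknowns
for solution in fundamental_system(A):
    print(solution)                               # n - r = 2 independent solutions
```

Every solution of the system is a linear combination of the returned vectors, whose number equals $n - r$ in accordance with Theorem 2.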





Lecture 5 


An annulet of an annulet and annulets of direct summands • 
Bilinear functionals and bilinear forms • Bilinear functionals 
in a conjugate space • Mixed bilinear functionals • Tensors 


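The fact that annulets are defined also for subsets of the space
$\mathcal{V}'$ allows us to speak of an annulet of an annulet

$$\operatorname{Ann} \operatorname{Ann} S = S^{\circ\circ}$$

of an arbitrary subset $S \subset \mathcal{V}$.

Proposition 1. For any subspace $\mathcal{P} \subset \mathcal{V}$ there holds the
equation

$$\mathcal{P}^{\circ\circ} = \mathcal{P}.$$

Proof. If $\mathbf{x} \in \mathcal{P}$, then $\boldsymbol{\xi}(\mathbf{x}) = 0$ for any $\boldsymbol{\xi} \in \mathcal{P}^\circ$, i.e.
$\mathbf{x}(\boldsymbol{\xi}) = 0$. This means that $\mathbf{x} \in \mathcal{P}^{\circ\circ}$. Thus
$\mathcal{P} \subset \mathcal{P}^{\circ\circ}$ and hence $\mathcal{P}^{\circ\circ} = \mathcal{P}$, for
$\dim \mathcal{P}^{\circ\circ} = n - \dim \mathcal{P}^\circ = n - (n - \dim \mathcal{P}) = \dim \mathcal{P}$. □

If, on the other hand, $S$ is an arbitrary set, then obviously
$S^{\circ\circ} = [S]$.

Proposition 2. If $\mathcal{V} = \mathcal{P} \oplus \mathcal{Q}$, then $\mathcal{V}' = \mathcal{P}^\circ \oplus \mathcal{Q}^\circ$.
And $\mathcal{P}^\circ \cong \mathcal{Q}'$ and $\mathcal{Q}^\circ \cong \mathcal{P}'$.

Proof. Let $\dim \mathcal{P} = p$ and $\dim \mathcal{Q} = q$. Then $p + q = n$
and $\mathcal{P} \cap \mathcal{Q} = 0$. Therefore
$\dim \mathcal{P}^\circ + \dim \mathcal{Q}^\circ = (n - p) + (n - q) = n$. Besides, if
$\boldsymbol{\xi} \in \mathcal{P}^\circ \cap \mathcal{Q}^\circ$, then $\boldsymbol{\xi}(\mathbf{x}) = 0$
for any $\mathbf{x} \in \mathcal{P}$ and $\boldsymbol{\xi}(\mathbf{y}) = 0$ for any $\mathbf{y} \in \mathcal{Q}$. Therefore
$\boldsymbol{\xi}(\mathbf{x} + \mathbf{y}) = 0$ and hence $\boldsymbol{\xi}(\mathbf{z}) = 0$ for any $\mathbf{z} \in \mathcal{V}$.
Consequently $\boldsymbol{\xi} = 0$, i.e. $\mathcal{P}^\circ \cap \mathcal{Q}^\circ = 0$. This proves (see the
corollary of Proposition 3 in Lecture 3) that $\mathcal{V}' = \mathcal{P}^\circ \oplus \mathcal{Q}^\circ$.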




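Now associate with every linear functional $\xi \in \mathcal{P}^0$ its
restriction

    $\xi|_{\mathcal{Q}}$

to the subspace $\mathcal{Q}$. In this way we obtain a certain mapping
$\xi \mapsto \xi|_{\mathcal{Q}}$ of the space $\mathcal{P}^0$ into the space $\mathcal{Q}'$, which is obviously
linear (a homomorphism). Its kernel consists of all functionals
$\xi \in \mathcal{P}^0$ for which $\xi|_{\mathcal{Q}} = 0$, i.e. such that $\xi \in \mathcal{Q}^0$.
But according to what has been proved $\mathcal{P}^0 \cap \mathcal{Q}^0 = 0$.
Therefore the mapping $\xi \mapsto \xi|_{\mathcal{Q}}$ is a monomorphism.

Let $\eta \in \mathcal{Q}'$. Define in $\mathcal{V}$ a functional $\xi$ setting for any
vector of the form $x + y$, where $x \in \mathcal{P}$, $y \in \mathcal{Q}$,

    $\xi(x + y) = \eta(y).$

It is clear that the functional $\xi$ is correctly defined, linear,
and belongs to $\mathcal{P}^0$ and that $\xi|_{\mathcal{Q}} = \eta$. This proves that the
mapping $\xi \mapsto \xi|_{\mathcal{Q}}$ is an isomorphism.

The isomorphism $\mathcal{Q}^0 \approx \mathcal{P}'$ can be proved in a similar
way. □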

Note that isomorphisms of Proposition 2 are “natural”. 

Definition 1. The function $B\colon x, y \mapsto B(x, y) \in K$ of two
vector arguments $x, y \in \mathcal{V}$ is said to be a bilinear functional
in $\mathcal{V}$ if for every fixed value of one argument it is a linear
functional of the other, i.e.

    $B(x_1 + x_2, y) = B(x_1, y) + B(x_2, y),$
    $B(kx, y) = kB(x, y)$

and

    $B(x, y_1 + y_2) = B(x, y_1) + B(x, y_2),$
    $B(x, ky) = kB(x, y)$

for any vectors $x_1, x_2, x, y_1, y_2, y \in \mathcal{V}$ and any number
$k \in K$.

One example of a bilinear functional is a scalar product
$(x, y)$ (see Lecture 13 in [1]). Pairings introduced in Lecture 4
are also bilinear, but their arguments are in general in different
spaces. Extending the theory set forth below to this
case presents no fundamental difficulties, but is rather
tedious. So we shall not take it up.






Let $e_1, \ldots, e_n$ be an arbitrary basis of a space $\mathcal{V}$. Setting

(1)     $b_{ij} = B(e_i, e_j)$

we obtain for any two vectors $x = x^i e_i$ and $y = y^j e_j$ the
equation

    $B(x, y) = B(e_i, e_j) x^i y^j = b_{ij} x^i y^j.$

This proves that

(2)     $B(x, y) = b_{ij} x^i y^j = \sum_{i=1}^{n} \sum_{j=1}^{n} b_{ij} x^i y^j =$
        $= b_{11} x^1 y^1 + \ldots + b_{1n} x^1 y^n +$
        $+ \ldots +$
        $+ b_{n1} x^n y^1 + \ldots + b_{nn} x^n y^n.$


As we know (see Lecture 14 in [1]) the algebraic expression
on the right is called a bilinear form in $x^1, \ldots, x^n$ and
$y^1, \ldots, y^n$. Thus any bilinear functional is expressed in coordinates
as a bilinear form with the coefficients (1) (called the
coefficients of a functional $B$ to abridge the statements).
Conversely, it is easy to see that any bilinear form gives
(by formula (2)) some bilinear functional. Hence there is
(for a given basis!) a bijective correspondence between bilinear
functionals and bilinear forms.

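The coefficients (1) of a bilinear functional $B$ form a matrix

    $B = (b_{ij}) = \begin{pmatrix} b_{11} & \cdots & b_{1n} \\ \vdots & & \vdots \\ b_{n1} & \cdots & b_{nn} \end{pmatrix}$

which is called the matrix of a bilinear functional $B$ (in a
given basis).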

It is clear that a sum of two bilinear functionals and a
product of a bilinear functional by a number are bilinear
functionals. This means that the totality $T_2(\mathcal{V})$ of all
bilinear functionals in the space $\mathcal{V}$ is a vector space. □

When adding bilinear functionals their matrices are added
together, and when multiplying a bilinear functional by
a number its matrix is multiplied by the same number. This
means that the correspondence associating with a bilinear
functional its matrix is an isomorphism of the vector space $T_2(\mathcal{V})$
onto the vector space of square matrices of order $n$. □

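With the aid of the matrix $B$ formula (2) can be written

    $B(x, y) = x^{\top} B y,$

where as always

    $x = \begin{pmatrix} x^1 \\ \vdots \\ x^n \end{pmatrix}, \quad y = \begin{pmatrix} y^1 \\ \vdots \\ y^n \end{pmatrix}.$

Cf. Lecture 14 in [1] where similar formulas were obtained
for the scalar product.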

Let $\xi, \eta \in T_1(\mathcal{V})$ be two linear functionals. It is clear
that the formula

    $(\xi \otimes \eta)(x, y) = \xi(x)\, \eta(y)$

defines a certain bilinear functional $\xi \otimes \eta$.

Definition 2. A functional $\xi \otimes \eta$ is called a tensor product
of the functionals $\xi$ and $\eta$.

Consider in particular the tensor products $e^i \otimes e^j$ of
covectors of a conjugate basis. Since $e^i(x) = x^i$ and $e^j(y) = y^j$
we have

    $(e^i \otimes e^j)(x, y) = x^i y^j.$

For a functional $B' = b_{ij}(e^i \otimes e^j)$ we thus have the formula

(3)     $B'(x, y) = b_{ij}(e^i \otimes e^j)(x, y) = b_{ij} x^i y^j.$

In particular $B'(e_i, e_j) = b_{ij}$, from which it follows that
the bilinear functionals $e^i \otimes e^j$, $i, j = 1, \ldots, n$, are linearly
independent (if $B' = 0$, then $b_{ij} = 0$). Besides, if we take
an arbitrary functional $B \in T_2(\mathcal{V})$ and compose from its
coefficients $b_{ij}$ the functional $B'$, then according to formula
(3) we have $B' = B$.

Thus we have proved the following proposition:

Proposition 3. The tensor products

    $e^i \otimes e^j, \quad i, j = 1, \ldots, n,$

of the covectors of a conjugate basis constitute a basis of the vector
space $T_2(\mathcal{V})$. The coordinates of an arbitrary bilinear functional
$B \in T_2(\mathcal{V})$ in that basis are its coefficients $b_{ij}$:

    $B = b_{ij}\, e^i \otimes e^j.$ □

In particular we see that

    $\dim T_2(\mathcal{V}) = n^2.$

Let us now take another basis:

    $e_{i'} = c^{i}_{i'} e_i.$

Then

    $b_{i'j'} = B(e_{i'}, e_{j'}) = c^{i}_{i'} c^{j}_{j'} B(e_i, e_j) = c^{i}_{i'} c^{j}_{j'} b_{ij}.$

Thus, in the new basis the coefficients of a bilinear functional
$B$ are expressed by the formula

    $b_{i'j'} = c^{i}_{i'} c^{j}_{j'} b_{ij}.$

In matrix notation this formula has the form

    $B' = C^{\top} B C,$

where $B = (b_{ij})$, $B' = (b_{i'j'})$, and $C = (c^{i}_{i'})$. Cf. Lecture 14
in [1].
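
The matrix form $B' = C^{\top} B C$ can be checked numerically. The following is a minimal sketch; the matrices and coordinate columns are illustrative data of our own, and the helper names (`matmul`, `transpose`, `apply`, `bilinear`) are hypothetical. It computes the same value $B(x, y)$ once in old coordinates and once in new coordinates, using $x = Cx'$.

```python
def matmul(a, b):
    """Product of two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def transpose(a):
    return [list(col) for col in zip(*a)]

def apply(c, x):
    """Old coordinates x = C x' of a vector with new coordinates x'."""
    return [sum(c[i][k] * x[k] for k in range(len(x))) for i in range(len(c))]

def bilinear(b, x, y):
    """Value b_ij x^i y^j of the bilinear functional with matrix b."""
    return sum(b[i][j] * x[i] * y[j]
               for i in range(len(x)) for j in range(len(y)))

# Illustrative data (not taken from the text).
B = [[1, 2], [3, 4]]
C = [[1, 1], [0, 1]]          # column i' holds the old coordinates of e_{i'}
B_new = matmul(matmul(transpose(C), B), C)   # B' = C^T B C

x_new, y_new = [2, -1], [1, 3]
value_old = bilinear(B, apply(C, x_new), apply(C, y_new))
value_new = bilinear(B_new, x_new, y_new)
```

Both values agree, as they must, since they are the value of one and the same functional on one and the same pair of vectors.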


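Bilinear functionals $B\colon \xi, \eta \mapsto B(\xi, \eta)$ of the covectors
$\xi, \eta \in \mathcal{V}'$ are defined and studied in a quite similar way.
The only change is in the position of indices. The values
of every such functional are expressed by the formula

    $B(\xi, \eta) = b^{ij} \xi_i \eta_j,$

where $b^{ij} = B(e^i, e^j)$ and $\xi_i = \xi(e_i)$, $\eta_j = \eta(e_j)$ are the
coordinates of the covectors $\xi$ and $\eta$. For the other basis,
$e_{i'} = c^{i}_{i'} e_i$, we have

    $b^{i'j'} = c^{i'}_{i} c^{j'}_{j} b^{ij},$

where $(c^{i'}_{i})$ is the matrix inverse to $(c^{i}_{i'})$.

Bilinear functionals of covectors constitute a vector space

    $T^2(\mathcal{V}) = T_2(\mathcal{V}')$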




of dimension $n^2$. A tensor product $x \otimes y$ of the vectors $x$
and $y$ is a functional in $T^2(\mathcal{V})$ defined by the formula

    $(x \otimes y)(\xi, \eta) = \xi(x)\, \eta(y).$

Tensor products of the form

    $e_i \otimes e_j, \quad i, j = 1, \ldots, n,$

constitute a basis of the space $T^2(\mathcal{V})$, with

    $B = b^{ij}\, e_i \otimes e_j$

for any functional $B \in T^2(\mathcal{V})$.

Of greater interest is the case of bilinear functionals

    $B\colon x, \xi \mapsto B(x, \xi)$

one argument of which is a vector $x \in \mathcal{V}$ and the other a
covector $\xi \in \mathcal{V}'$. We shall call such functionals mixed functionals.
They also form an $n^2$-dimensional vector space.
We shall designate this space by the symbol $T_1^1(\mathcal{V})$.

In coordinates the values of a mixed functional $B$ are
expressed by the formula

    $B(x, \xi) = b^{j}_{i} x^i \xi_j,$

where $b^{j}_{i} = B(e_i, e^j)$, while $x^i = e^i(x)$ and $\xi_j = \xi(e_j)$
are the coordinates of the vector $x$ and covector $\xi$ (in the
conjugate bases $e_1, \ldots, e_n$ and $e^1, \ldots, e^n$).

On defining the tensor product $\eta \otimes y$ of the covector $\eta$
and vector $y$ by the formula

    $(\eta \otimes y)(x, \xi) = \eta(x)\, \xi(y)$

we immediately see that tensor products of the form $e^i \otimes e_j$
constitute a basis of the space $T_1^1(\mathcal{V})$, with

    $B = b^{j}_{i}\, e^i \otimes e_j$

for any $B \in T_1^1(\mathcal{V})$. □

In the basis

    $e_{i'} = c^{i}_{i'} e_i$

the coefficients of a functional $B \in T_1^1(\mathcal{V})$ are expressed
by the formula

(4)     $b^{j'}_{i'} = c^{i}_{i'} c^{j'}_{j} b^{j}_{i}.$

This is a type of transformation quite different from that
for the coefficients of bilinear functionals of $T_2(\mathcal{V})$ or
$T^2(\mathcal{V})$. To visualize it, we shall write it in matrix notation
(and at the same time derive it anew).

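Let

    $B = (b^{j}_{i}) = \begin{pmatrix} b^1_1 & \cdots & b^1_n \\ \vdots & & \vdots \\ b^n_1 & \cdots & b^n_n \end{pmatrix}$

and

    $x = \begin{pmatrix} x^1 \\ \vdots \\ x^n \end{pmatrix}, \quad \xi = (\xi_1, \ldots, \xi_n).$

Then, as can easily be seen,

    $B(x, \xi) = \xi B x.$

Further let $B' = (b^{j'}_{i'})$ and correspondingly let $x'$ and $\xi'$ be
the column and the row of coordinates in the new basis, and so

    $B(x, \xi) = \xi' B' x'.$

As we know,

    $x = Cx'$

and

    $\xi' = \xi C$, i.e. $\xi = \xi' C^{-1}$

(the coordinates of covectors are transformed cogrediently).
Therefore

    $\xi B x = \xi' C^{-1} B C x' = \xi' B' x',$

i.e.

    $B' = C^{-1} B C.$

This is precisely formula (4) in matrix notation. Instead of
the transposed matrix there has appeared the inverse
matrix $C^{-1}$.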
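
For comparison with the law $B' = C^{\top} B C$, the law $B' = C^{-1} B C$ can also be checked on numbers. The sketch below is illustrative only: the data are our own, the helper names (`matmul`, `inverse_2x2`, `mixed`) are hypothetical, and the matrix of $B$ is stored with the superscript as the row index so that $B(x, \xi) = \xi B x$.

```python
from fractions import Fraction

def matmul(a, b):
    """Product of two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def inverse_2x2(m):
    """Inverse of a nonsingular 2x2 matrix, computed exactly."""
    (a, b), (c, d) = m
    det = Fraction(a * d - b * c)
    return [[d / det, -b / det], [-c / det, a / det]]

def mixed(b, x, xi):
    """Value xi B x = b^j_i x^i xi_j (row index of b is the superscript j)."""
    return sum(xi[j] * b[j][i] * x[i]
               for j in range(len(xi)) for i in range(len(x)))

# Illustrative data (not taken from the text).
B = [[1, 2], [3, 4]]
C = [[2, 1], [1, 1]]
B_new = matmul(matmul(inverse_2x2(C), B), C)      # B' = C^{-1} B C

x_new = [1, 2]                                     # coordinates x'
xi_old = [3, -1]                                   # coordinates xi
x_old = [sum(C[i][k] * x_new[k] for k in range(2)) for i in range(2)]    # x = C x'
xi_new = [sum(xi_old[i] * C[i][k] for i in range(2)) for k in range(2)]  # xi' = xi C

value_old = mixed(B, x_old, xi_old)
value_new = mixed(B_new, x_new, xi_new)
```

The two values coincide, while replacing `inverse_2x2(C)` by the transpose of `C` would in general break the equality.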

A generalization suggests itself. 

Definition 3. A $(p, q)$-tensor in a space $\mathcal{V}$, where $p, q \geq 0$,
is an arbitrary function

    $T\colon x_1, \ldots, x_p, \xi^1, \ldots, \xi^q \mapsto T(x_1, \ldots, x_p, \xi^1, \ldots, \xi^q)$

of $p$ vector arguments $x_1, \ldots, x_p$ and $q$ covector arguments
$\xi^1, \ldots, \xi^q$, which is linear in every argument (with the
values of the others fixed), that is to say, multilinear.

Thus bilinear functionals of vectors are $(2, 0)$-tensors,
bilinear functionals of covectors are $(0, 2)$-tensors, and
mixed bilinear functionals are $(1, 1)$-tensors.

Similarly, covectors are $(1, 0)$-tensors, and vectors, by
virtue of the identification $\mathcal{V} = (\mathcal{V}')'$, are $(0, 1)$-tensors.

According to the general conventions about functions
$(0, 0)$-tensors having no arguments at all are identified with
elements of the field $K$.

The set of all $(p, q)$-tensors is designated by the symbol
$T_p^q(\mathcal{V})$, a zero index being dropped. This is in agreement
with the notation $T_2(\mathcal{V})$, $T^2(\mathcal{V})$, and $T_1^1(\mathcal{V})$ introduced
above for spaces of bilinear functionals, as well as with
the notation $T_1(\mathcal{V})$ introduced for a conjugate space $\mathcal{V}'$.
According to what has been said above $T^1(\mathcal{V}) = \mathcal{V}$ and
$T_0^0(\mathcal{V}) = K$.

It is clear that each of the sets $T_p^q(\mathcal{V})$ is a vector space
(under the ordinary linear operations on functions).

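Let $e_1, \ldots, e_n$ be an arbitrary basis of a space $\mathcal{V}$ and
$e^1, \ldots, e^n$ a conjugate basis of the space $\mathcal{V}'$. Also let

    $x_1 = x_1^{i_1} e_{i_1}, \ \ldots, \ x_p = x_p^{i_p} e_{i_p},$
    $\xi^1 = \xi^1_{j_1} e^{j_1}, \ \ldots, \ \xi^q = \xi^q_{j_q} e^{j_q}.$

Then by multilinearity

(5)     $T(x_1, \ldots, x_p, \xi^1, \ldots, \xi^q) = T^{j_1 \ldots j_q}_{i_1 \ldots i_p}\, x_1^{i_1} \cdots x_p^{i_p}\, \xi^1_{j_1} \cdots \xi^q_{j_q},$

where

    $T^{j_1 \ldots j_q}_{i_1 \ldots i_p} = T(e_{i_1}, \ldots, e_{i_p}, e^{j_1}, \ldots, e^{j_q}).$

The numbers $T^{j_1 \ldots j_q}_{i_1 \ldots i_p}$ are called the coefficients of a tensor $T$.
Their number is equal to $n^{p+q}$.

To reduce the formulas it is convenient to introduce the
composite indices

    $\alpha = (i_1, \ldots, i_p)$ and $\beta = (j_1, \ldots, j_q).$

Setting

    $T^{\beta}_{\alpha} = T^{j_1 \ldots j_q}_{i_1 \ldots i_p}$

and

    $x^{\alpha} = x_1^{i_1} \cdots x_p^{i_p}, \quad \xi_{\beta} = \xi^1_{j_1} \cdots \xi^q_{j_q},$

we can write formula (5) in the following reduced form:

(7)     $T(x_1, \ldots, x_p, \xi^1, \ldots, \xi^q) = T^{\beta}_{\alpha} x^{\alpha} \xi_{\beta}.$

This formula means that in coordinates any tensor can be
expressed as a multilinear form $T^{\beta}_{\alpha} x^{\alpha} \xi_{\beta}$.

Conversely, every multilinear form
$T^{j_1 \ldots j_q}_{i_1 \ldots i_p}\, x_1^{i_1} \cdots x_p^{i_p}\, \xi^1_{j_1} \cdots \xi^q_{j_q}$
gives by formula (7) some functional

    $T\colon x_1, \ldots, x_p, \xi^1, \ldots, \xi^q \mapsto T(x_1, \ldots, x_p, \xi^1, \ldots, \xi^q)$

which is obviously multilinear, i.e. a tensor.

Thus, for a given basis $e_1, \ldots, e_n$ of a space $\mathcal{V}$, $(p, q)$-tensors
are in bijective correspondence to multilinear forms $T^{\beta}_{\alpha} x^{\alpha} \xi_{\beta}$,
i.e. to sets $\{T^{\beta}_{\alpha}\} = \{T^{j_1 \ldots j_q}_{i_1 \ldots i_p}\}$ of elements of the field $K$. □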

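Let us now transform from the basis $e_1, \ldots, e_n$ to a new
basis $e_{1'}, \ldots, e_{n'}$. Let

    $e_{i'} = c^{i}_{i'} e_i.$

Then, by multilinearity, for the coefficients
$T^{j'_1 \ldots j'_q}_{i'_1 \ldots i'_p}$ of the tensor $T$ in the basis
$e_{1'}, \ldots, e_{n'}$ we have

(8)     $T^{j'_1 \ldots j'_q}_{i'_1 \ldots i'_p} = c^{i_1}_{i'_1} \cdots c^{i_p}_{i'_p}\, c^{j'_1}_{j_1} \cdots c^{j'_q}_{j_q}\, T^{j_1 \ldots j_q}_{i_1 \ldots i_p},$

where $(c^{j'}_{j})$ is the matrix inverse to $(c^{i}_{i'})$.

This is the so-called tensor transformation law. We may
say by convention that in formula (8) each index is transformed
irrespective of the others, with the subscripts transformed
cogrediently and the superscripts transformed contragrediently.

In contracted notation formula (8) has the form

    $T^{\beta'}_{\alpha'} = c^{\alpha}_{\alpha'} c^{\beta'}_{\beta} T^{\beta}_{\alpha},$

where

    $c^{\alpha}_{\alpha'} = c^{i_1}_{i'_1} \cdots c^{i_p}_{i'_p}, \quad c^{\beta'}_{\beta} = c^{j'_1}_{j_1} \cdots c^{j'_q}_{j_q}.$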


Theorem 1. Suppose every basis $e_1, \ldots, e_n$ of a space $\mathcal{V}$
has associated with it numbers $T^{j_1 \ldots j_q}_{i_1 \ldots i_p}$, numbers associated
with different bases being related to one another by the tensor
transformation law (8). Then there exists in the space $\mathcal{V}$ a
unique $(p, q)$-tensor the coefficients of which in each basis
$e_1, \ldots, e_n$ are the given numbers $T^{j_1 \ldots j_q}_{i_1 \ldots i_p}$.

Proof. As was already stated above, the giving of numbers
$T^{j_1 \ldots j_q}_{i_1 \ldots i_p} = T^{\beta}_{\alpha}$ in a given basis $e_1, \ldots, e_n$ determines by
the formula

    $T(x_1, \ldots, x_p, \xi^1, \ldots, \xi^q) = T^{\beta}_{\alpha} x^{\alpha} \xi_{\beta}$

some $(p, q)$-tensor $T$. To prove Theorem 1 it is therefore
sufficient to verify that in any other basis $e_{1'}, \ldots, e_{n'}$ that
tensor has the given coefficients. But this is obvious, for
according to that which has been proved above the coefficients
of the tensor $T$ in the basis $e_{1'}, \ldots, e_{n'}$ are the numbers
$c^{\alpha}_{\alpha'} c^{\beta'}_{\beta} T^{\beta}_{\alpha}$ and under the hypothesis these numbers are
precisely equal to $T^{\beta'}_{\alpha'}$. □

According to this theorem tensors can be identified with
sets of numbers $T^{j_1 \ldots j_q}_{i_1 \ldots i_p}$ related by formulas (8). It is in
this form that tensors usually appear in physics. In this
interpretation the numbers $T^{j_1 \ldots j_q}_{i_1 \ldots i_p}$ are generally termed not
coefficients of tensors, but tensor components.



Lecture 6 




Multiplication of tensors • The basis of a space of tensors • 
Contraction of tensors • The rank space of a multilinear 
functional 


As was already noted in the preceding lecture, tensors of the
same type can be added together. It is clear that in doing
so their coefficients (components) are also added:

    $(T + S)^{j_1 \ldots j_q}_{i_1 \ldots i_p} = T^{j_1 \ldots j_q}_{i_1 \ldots i_p} + S^{j_1 \ldots j_q}_{i_1 \ldots i_p}.$

When tensors are interpreted as sets of numbers $T^{j_1 \ldots j_q}_{i_1 \ldots i_p}$
this formula is taken as the definition of their sum.

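However, defined for tensors besides the operation of
addition is also the operation of multiplication which is
designated by the symbol $\otimes$. We can multiply any $(p, q)$-
and $(r, s)$-tensors to obtain as a result a $(p + r, q + s)$-tensor.
On components multiplication is defined by the formula

    $(T \otimes S)^{j_1 \ldots j_{q+s}}_{i_1 \ldots i_{p+r}} = T^{j_1 \ldots j_q}_{i_1 \ldots i_p}\, S^{j_{q+1} \ldots j_{q+s}}_{i_{p+1} \ldots i_{p+r}}$

(each component of the tensor $T$ is thus multiplied by each
component of the tensor $S$) or, when tensors are interpreted as
multilinear functions, by the formula

    $(T \otimes S)(x_1, \ldots, x_{p+r}, \xi^1, \ldots, \xi^{q+s}) =$
    $= T(x_1, \ldots, x_p, \xi^1, \ldots, \xi^q)\, S(x_{p+1}, \ldots, x_{p+r}, \xi^{q+1}, \ldots, \xi^{q+s}).$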

It is obvious that $\otimes$-multiplication is distributive over
addition

    $(T + S) \otimes R = T \otimes R + S \otimes R,$
    $R \otimes (T + S) = R \otimes T + R \otimes S,$

and associative:

    $(T \otimes S) \otimes R = T \otimes (S \otimes R).$

But in general it is noncommutative:

    $T \otimes S \neq S \otimes T.$

If one (or both) of the factors is a $(0, 0)$-tensor, i.e. a number
$k$, then the tensor product coincides with the ordinary
one:

    $k \otimes T = T \otimes k = kT.$
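
The functional form of the product is easy to model. The sketch below is restricted, by assumption, to tensors with vector arguments only (that is, $(p, 0)$-tensors), represented as ordinary Python functions of coordinate tuples; the helper name `tensor` is hypothetical. It illustrates noncommutativity and associativity on the covectors $e^1$, $e^2$ of a conjugate basis.

```python
def tensor(T, p, S):
    """Tensor product of a functional T of p vector arguments with a functional S.

    The first p arguments go to T, the remaining ones to S.
    """
    def product(*xs):
        return T(*xs[:p]) * S(*xs[p:])
    return product

# Covectors of the conjugate basis: e^i(x) is the i-th coordinate of x.
e1 = lambda x: x[0]
e2 = lambda x: x[1]

e1_e2 = tensor(e1, 1, e2)     # e^1 (x) e^2
e2_e1 = tensor(e2, 1, e1)     # e^2 (x) e^1

x, y, z = (2, 3), (5, 7), (11, 13)
left = tensor(tensor(e1, 1, e2), 2, e1)      # (e^1 (x) e^2) (x) e^1
right = tensor(e1, 1, tensor(e2, 1, e1))     # e^1 (x) (e^2 (x) e^1)
```

On the basis vectors $(1, 0)$ and $(0, 1)$ the two products `e1_e2` and `e2_e1` take different values, while `left` and `right` agree on every triple of vectors.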


Under $+$ and $\otimes$ operations all vector spaces $T_p^q(\mathcal{V})$
constitute an algebraic object that is an example of the so-called
twice graded algebra. This algebra is designated by
the symbol $T(\mathcal{V})$ and called the tensor algebra of a vector
space $\mathcal{V}$.

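Let as always $e_1, \ldots, e_n$ be an arbitrary basis of a space $\mathcal{V}$
and $e^1, \ldots, e^n$ a conjugate basis of the space $\mathcal{V}'$.
For any composite indices $\alpha = (i_1, \ldots, i_p)$ and
$\beta = (j_1, \ldots, j_q)$ we set

    $e^{\alpha} = e^{i_1} \otimes \ldots \otimes e^{i_p}, \quad e_{\beta} = e_{j_1} \otimes \ldots \otimes e_{j_q}.$

Then

    $e^{\alpha}(x_1, \ldots, x_p) = x_1^{i_1} \cdots x_p^{i_p} = x^{\alpha}$

and similarly

    $e_{\beta}(\xi^1, \ldots, \xi^q) = \xi^1_{j_1} \cdots \xi^q_{j_q} = \xi_{\beta}$

for any vectors $x_1, \ldots, x_p$ and any covectors $\xi^1, \ldots, \xi^q$.
Therefore for any numbers $T^{\beta}_{\alpha}$ we have

    $(T^{\beta}_{\alpha}\, e^{\alpha} \otimes e_{\beta})(x_1, \ldots, x_p, \xi^1, \ldots, \xi^q) = T^{\beta}_{\alpha} x^{\alpha} \xi_{\beta}.$

This proves the following proposition.

Proposition 1. All possible tensor products of the form

    $e^{\alpha} \otimes e_{\beta} = e^{i_1} \otimes \ldots \otimes e^{i_p} \otimes e_{j_1} \otimes \ldots \otimes e_{j_q}$

constitute a basis of the space $T_p^q(\mathcal{V})$. The coordinates of the
tensor $T$ in this basis are its coefficients:

    $T = T^{j_1 \ldots j_q}_{i_1 \ldots i_p}\, e^{i_1} \otimes \ldots \otimes e^{i_p} \otimes e_{j_1} \otimes \ldots \otimes e_{j_q} = T^{\beta}_{\alpha}\, e^{\alpha} \otimes e_{\beta}.$ □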

For the case of bilinear functionals we already know this
proposition from the preceding lecture.

In particular we see that

    $\dim T_p^q(\mathcal{V}) = n^{p+q},$

so that the dimension of a space of $(p, q)$-tensors equals, as
was to be expected, the number of their components.


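Let $T$ be an arbitrary $(p, q)$-tensor, where $p > 0$ and
$q > 0$, and let $1 \leq k \leq p$, $1 \leq l \leq q$. On substituting in
the tensor $T$ a vector $e_i$ of a basis for the $k$th vector argument
and a covector $e^i$ for the $l$th covector argument and
carrying out summation over $i$ (from 1 to $n$) we obtain one
new $(p - 1, q - 1)$-tensor $S$. Thus

    $S(x_1, \ldots, x_{p-1}, \xi^1, \ldots, \xi^{q-1}) =$
    $= T(x_1, \ldots, x_{k-1}, e_i, x_k, \ldots, x_{p-1}, \xi^1, \ldots, \xi^{l-1}, e^i, \xi^l, \ldots, \xi^{q-1}),$

the right-hand side implying according to the Einstein
convention summation over $i$. The components of the tensor
$S$ are obviously expressible by the formula

    $S^{j_1 \ldots j_{q-1}}_{i_1 \ldots i_{p-1}} = T^{j_1 \ldots j_{l-1}\, i\, j_l \ldots j_{q-1}}_{i_1 \ldots i_{k-1}\, i\, i_k \ldots i_{p-1}}.$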

Definition 1. The constructed tensor $S$ is called a contraction
of a tensor $T$ over the $k$th subscript and the $l$th superscript.

It is necessary to verify that this definition is correct,
i.e. that a tensor $S$ is independent of the choice of a basis
$e_1, \ldots, e_n$. But this is easily done. Indeed, if $e_{1'}, \ldots, e_{n'}$
is any other basis and

    $e_{i'} = c^{i}_{i'} e_i, \quad e^{i'} = c^{i'}_{j} e^j,$

then on replacing “noncontractible” arguments by dots we get

    $T(\ldots, e_{i'}, \ldots, e^{i'}, \ldots) = c^{i}_{i'} c^{i'}_{j}\, T(\ldots, e_i, \ldots, e^j, \ldots) =$
    $= \delta^{i}_{j}\, T(\ldots, e_i, \ldots, e^j, \ldots) = T(\ldots, e_i, \ldots, e^i, \ldots) =$
    $= S(x_1, \ldots, x_{p-1}, \xi^1, \ldots, \xi^{q-1}).$ □



Examples of contractions.

1. On contracting a mixed bilinear functional $B(x, \xi) = b^{j}_{i} x^i \xi_j$
over the only subscript and the only superscript
we obtain a $(0, 0)$-tensor, i.e. a number $B(e_i, e^i)$. This number
is called the trace of the functional and designated by
the symbol $\operatorname{tr} B$. Thus by definition

    $\operatorname{tr} B = b^{i}_{i} = b^1_1 + \ldots + b^n_n,$

whence we see that the trace of a functional is equal to the
trace of its matrix, i.e. to the sum of the diagonal elements of
the matrix.

2. In particular, for any vector $x$ and covector $\xi$

    $\operatorname{tr}(\xi \otimes x) = \xi_i x^i = \xi_1 x^1 + \ldots + \xi_n x^n.$

3. Let $T$ be an arbitrary $(p, q)$-tensor. By taking $p$ vectors
$x_1, \ldots, x_p$ and $q$ covectors $\xi^1, \ldots, \xi^q$ we can construct a
$(p + q, p + q)$-tensor

    $x_1 \otimes \ldots \otimes x_p \otimes T \otimes \xi^1 \otimes \ldots \otimes \xi^q.$

On contracting the tensor $p + q$ times over the subscripts and
superscripts with the same numbers we obviously obtain
a number

    $T^{j_1 \ldots j_q}_{i_1 \ldots i_p}\, x_1^{i_1} \cdots x_p^{i_p}\, \xi^1_{j_1} \cdots \xi^q_{j_q},$

i.e. the value of the tensor $T$ on the vectors $x_1, \ldots, x_p$
and covectors $\xi^1, \ldots, \xi^q$.
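
Contraction on components is a plain summation over a repeated index, which the following sketch illustrates. The storage conventions are assumptions of ours: a mixed functional is kept as a matrix `b[j][i]` with the superscript as row index, and a $(2, 1)$-tensor is kept as a nested list `t[j][i1][i2]`; the helper names are hypothetical.

```python
def trace(b):
    """Contraction b^i_i of a mixed bilinear functional with matrix b[j][i]."""
    return sum(b[i][i] for i in range(len(b)))

def mixed_from(xi, x):
    """Matrix of xi (x) x: its coefficients are b^j_i = xi_i x^j."""
    n = len(x)
    return [[xi[i] * x[j] for i in range(n)] for j in range(n)]

def contract_first(t):
    """Contract T^j_{i1 i2}, stored as t[j][i1][i2], over the first
    subscript and the only superscript; the result is a covector S_{i2}."""
    n = len(t)
    return [sum(t[i][i][k] for i in range(n)) for k in range(n)]

# Illustrative data (not taken from the text).
xi, eta, x = [3, -1], [1, 2], [4, 3]

value = trace(mixed_from(xi, x))     # should equal xi_i x^i = xi(x)

# Coefficients of xi (x) eta (x) x, a (2, 1)-tensor: T^j_{i1 i2} = xi_{i1} eta_{i2} x^j.
t = [[[xi[i1] * eta[i2] * x[j] for i2 in range(2)] for i1 in range(2)]
     for j in range(2)]
covector = contract_first(t)         # should equal xi(x) * eta
```

Here `value` reproduces Example 2, and `covector` is the covector $\xi(x)\,\eta$.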






Of particular interest are $(p, 0)$-tensors also called multilinear
functionals in a vector space $\mathcal{V}$. The number $p$ of
arguments is called the degree of a functional.

For a chosen basis $e_1, \ldots, e_n$ of a space $\mathcal{V}$ every multilinear
functional $A$ of degree $p$ is uniquely determined by
its coefficients

    $A_{i_1 \ldots i_p} = A(e_{i_1}, \ldots, e_{i_p})$

using either of the two equivalent formulas:

    $A(x_1, \ldots, x_p) = A_{i_1 \ldots i_p}\, x_1^{i_1} \cdots x_p^{i_p}$

or

    $A = A_{i_1 \ldots i_p}\, e^{i_1} \otimes \ldots \otimes e^{i_p}.$

If we fix in a functional $A$ all arguments but one, the
result is a functional of degree 1, i.e. a covector.

Definition 2. Every such covector is called a covector
associated with a multilinear functional $A$.

To obtain an arbitrary associated covector $\xi$ it is necessary
to give $p - 1$ vectors $a_1, \ldots, a_{p-1}$ and a number $i$.
The covector $\xi$ is then given by the formula

    $\xi(x) = A(a_1, \ldots, a_{i-1}, x, a_i, \ldots, a_{p-1}).$

Definition 3. The subspace of the space $\mathcal{V}'$ generated by all
covectors associated with a multilinear functional $A$ is
called the rank space of that functional.

Definition 4. A multilinear functional $A$ of degree $p$ is
said to be expressible in tensor form in terms of covectors
$\xi^1, \ldots, \xi^r$ if it is a linear combination of tensor products
of the form $\xi^{j_1} \otimes \ldots \otimes \xi^{j_p}$, where $1 \leq j_1, \ldots, j_p \leq r$.

Proposition 2. Any multilinear functional $A$ is expressible
in tensor form in terms of every basis of its rank space.

Proof. Let $\mathcal{M}$ be the rank space of a functional $A$ and
$e^1, \ldots, e^r$ be its arbitrary basis. Supplement this basis to
obtain a basis

    $e^1, \ldots, e^r, e^{r+1}, \ldots, e^n$

of the whole space $\mathcal{V}'$ and consider a conjugate basis

    $e_1, \ldots, e_r, e_{r+1}, \ldots, e_n$

of the space $\mathcal{V} = (\mathcal{V}')'$. As we know (see Lecture 4), the
vectors

    $e_{r+1}, \ldots, e_n$

constitute a basis of the annulet of the subspace $\mathcal{M}$ and
hence

    $\xi(e_j) = 0$ when $j > r$

for any covector $\xi \in \mathcal{M}$. But contained among the covectors
of $\mathcal{M}$ are in particular all covectors of the form

    $\xi(x) = A(x, e_{i_2}, \ldots, e_{i_p}).$

For any indices $i_2, \ldots, i_p$ and any index $i_1 > r$ we have
therefore

    $A(e_{i_1}, e_{i_2}, \ldots, e_{i_p}) = 0,$

i.e.

    $A_{i_1 i_2 \ldots i_p} = 0.$

It can be proved in a similar way that this last equation
holds not only for $i_1 > r$, but also for $i_2 > r$ and in general
whenever $i_k > r$ at least for one $k = 1, 2, \ldots, p$. But then
we may consider that in the expansion

    $A = A_{i_1 \ldots i_p}\, e^{i_1} \otimes \ldots \otimes e^{i_p}$

summation over all the indices takes place only from 1 to $r$,
and this precisely means that in tensor form a functional $A$
is expressible in terms of a basis $e^1, \ldots, e^r$. □



Lecture 7 


The rank of a multilinear functional • Functionals and permutations •
Alternation


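Let us continue the study of the rank space of a multilinear
functional.

Let $A$ be a multilinear functional and $\mathcal{M}$ its rank space.
Further let $\xi^1, \ldots, \xi^r$ be an arbitrary family of covectors in
terms of which the functional $A$ is expressible in tensor
form.

Proposition 1. The subspace $\mathcal{M}$ is contained in the linear
span of covectors $\xi^1, \ldots, \xi^r$:

    $\mathcal{M} \subset [\xi^1, \ldots, \xi^r].$

Proof. Under the hypothesis we have

    $A = k_{j_1 \ldots j_p}\, \xi^{j_1} \otimes \ldots \otimes \xi^{j_p},$

where $k_{j_1 \ldots j_p}$ are some numbers, and summation over
$j_1, \ldots, j_p$ takes place from 1 to $r$.

An arbitrary covector

    $\xi(x) = A(a_1, \ldots, a_{i-1}, x, a_i, \ldots, a_{p-1})$

associated with the functional $A$ can therefore be expressed
by the formula

    $\xi = c_j \xi^j,$

where

    $c_j = k_{j_1 \ldots j_{i-1}\, j\, j_i \ldots j_{p-1}}\, \xi^{j_1}(a_1) \cdots \xi^{j_{i-1}}(a_{i-1})\, \xi^{j_i}(a_i) \cdots \xi^{j_{p-1}}(a_{p-1}).$

Consequently $\xi \in [\xi^1, \ldots, \xi^r]$ and hence $\mathcal{M} \subset [\xi^1, \ldots, \xi^r]$. □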













Definition 1. The dimension of the rank space $\mathcal{M}$ is called
the rank of a multilinear functional $A$.

Theorem 1. The rank of a multilinear functional $A$ is equal
to the smallest number of covectors in terms of which the functional
$A$ can be expressed in tensor form, i.e.

(a) if the functional $A$ is expressible in tensor form in terms
of covectors $\xi^1, \ldots, \xi^r$, then its rank does not exceed $r$;

(b) if $r$ is the rank of the functional $A$, then there exist $r$
covectors $\xi^1, \ldots, \xi^r$ in terms of which the functional $A$ is
expressible in tensor form.

Moreover, the family of covectors $\xi^1, \ldots, \xi^r$ possesses the
property indicated in (b) if and only if it is a basis of the rank
space $\mathcal{M}$.

Proof. According to Proposition 2 of the preceding lecture,
in tensor form the functional $A$ can be expressed in terms of
a basis of the space $\mathcal{M}$. This proves, in particular, statement (b).

If, on the other hand, the functional $A$ can be expressed
in tensor form in terms of covectors $\xi^1, \ldots, \xi^r$ and therefore,
according to Proposition 1, we have the inclusion

    $\mathcal{M} \subset [\xi^1, \ldots, \xi^r],$

then

    $\dim \mathcal{M} \leq \dim [\xi^1, \ldots, \xi^r] \leq r.$

This proves statement (a).

In addition we see that when $r = \dim \mathcal{M}$ there must necessarily
hold the equation

    $\mathcal{M} = [\xi^1, \ldots, \xi^r]$

showing that the family of covectors $\xi^1, \ldots, \xi^r$ (obviously,
linearly independent) is a basis of the space $\mathcal{M}$.

This completes the proof of all the statements of Theorem 1. □

It goes without saying that all this remains valid (with
obvious modifications) for functionals of covectors ($(0, p)$-tensors).
One should only keep in mind that it is vectors that
are associated with such functionals, so that the rank space
turns out to be a subspace of the vector space $\mathcal{V}$.




Recall that a permutation of degree $p$ is an arbitrary bijective
mapping of a set $\{1, \ldots, p\}$ onto itself. Any such
permutation $\sigma$ is usually represented by a two-row array

    $\begin{pmatrix} 1 & 2 & \ldots & p \\ \sigma(1) & \sigma(2) & \ldots & \sigma(p) \end{pmatrix},$

although in general the lower row alone would be quite
enough.

All permutations of degree $p$ form a group (under composition)
which is called a symmetric group and designated by
the symbol $S_p$.

Permutations are divided into even and odd ones according
to whether the number of pairs $(\sigma(i), \sigma(j))$ for which $i < j$
but $\sigma(i) > \sigma(j)$ is even or odd.

The sign of a permutation is the number $+1$ if the permutation
is even and the number $-1$ if the permutation is odd.
We shall designate the sign of a permutation $\sigma$ by the
symbol $\varepsilon_\sigma$.

It is known that

    $\varepsilon_{\sigma\tau} = \varepsilon_\sigma \varepsilon_\tau$

for any two permutations $\sigma$ and $\tau$, from which it follows in
particular that all even permutations constitute a subgroup of
the group $S_p$.
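
The definition of the sign through pairs in inverted order translates directly into code. In the sketch below a permutation is represented, by our own convention, by its lower row as a tuple of the values $1, \ldots, p$; the function names `sign` and `compose` are hypothetical. The multiplicative property is then checked by brute force over $S_4$.

```python
from itertools import permutations

def sign(perm):
    """Sign of a permutation given by its lower row (sigma(1), ..., sigma(p))."""
    inversions = sum(1
                     for i in range(len(perm))
                     for j in range(i + 1, len(perm))
                     if perm[i] > perm[j])
    return -1 if inversions % 2 else 1

def compose(s, t):
    """Lower row of the product s t, where (s t)(i) = s(t(i))."""
    return tuple(s[t[i] - 1] for i in range(len(t)))

group = list(permutations(range(1, 5)))          # all of S_4
multiplicative = all(sign(compose(s, t)) == sign(s) * sign(t)
                     for s in group for t in group)
even_count = sum(1 for s in group if sign(s) == 1)
```

Half of the permutations turn out to be even, as expected for the subgroup of even permutations.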

Let $A$ be a multilinear functional of degree $p$.

Definition 2. For any permutation $\sigma \in S_p$ the symbol $\sigma A$
stands for a functional given by the formula

    $(\sigma A)(x_1, \ldots, x_p) = A(x_{\sigma(1)}, \ldots, x_{\sigma(p)}).$

It is clear that

    $(\sigma A)_{i_1 \ldots i_p} = A_{i_{\sigma(1)} \ldots i_{\sigma(p)}}.$

In order to obtain the coefficients of a functional $\sigma A$ it is
thus necessary to apply the permutation $\sigma$ to the indices of the
functional $A$.

Example. If $n = 5$, $p = 3$ and

    $\sigma = \begin{pmatrix} 1 & 2 & 3 \\ 3 & 1 & 2 \end{pmatrix},$

then

    $(\sigma A)_{145} = A_{514}, \quad (\sigma A)_{553} = A_{355}.$

It is obvious that for any permutation $\sigma$ the mapping

    $A \mapsto \sigma A$

is a linear mapping (homomorphism) of the vector space $T_p(\mathcal{V})$
onto itself. Moreover, as can easily be seen,

    $(\sigma\tau) A = \sigma(\tau A)$

for any permutations $\sigma, \tau \in S_p$, from which in particular
it follows that the mapping $A \mapsto \sigma A$ is an isomorphism. □

From now on we shall assume that the ground field K has 
the characteristic 0 , i.e. it is possible to divide in it by any 
natural number (and in particular by the factorial p!). 

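Definition 3. For any functional $A \in T_p(\mathcal{V})$ the symbol
$\operatorname{Alt} A$ designates a functional defined by the formula

    $\operatorname{Alt} A = \frac{1}{p!} \sum_{\sigma \in S_p} \varepsilon_\sigma (\sigma A).$

It is clear that the mapping

    $A \mapsto \operatorname{Alt} A$

is linear (is a homomorphism). It is called an alternation.

Since $\operatorname{Alt}\colon T_p(\mathcal{V}) \to T_p(\mathcal{V})$ and $\sigma\colon T_p(\mathcal{V}) \to T_p(\mathcal{V})$,
the composition mappings $\sigma \circ \operatorname{Alt}$ and $\operatorname{Alt} \circ\, \sigma$ are
well-defined.

Proposition 2. For any permutation $\sigma \in S_p$ there are relations

    $\operatorname{Alt} \circ\, \sigma = \varepsilon_\sigma \operatorname{Alt}, \quad \sigma \circ \operatorname{Alt} = \varepsilon_\sigma \operatorname{Alt}.$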

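Proof. For every functional $A \in T_p(\mathcal{V})$ we have

    $\operatorname{Alt}(\sigma A) = \frac{1}{p!} \sum_{\tau \in S_p} \varepsilon_\tau (\tau\sigma A) =$
    $= \varepsilon_\sigma \frac{1}{p!} \sum_{\tau \in S_p} \varepsilon_{\tau\sigma} (\tau\sigma A) =$
    $= \varepsilon_\sigma \frac{1}{p!} \sum_{\tau \in S_p} \varepsilon_\tau (\tau A) = \varepsilon_\sigma \operatorname{Alt} A,$

for $\tau\sigma$ runs simultaneously with $\tau$ over the whole group $S_p$.
Similarly,

    $\sigma \operatorname{Alt} A = \frac{1}{p!} \sum_{\tau \in S_p} \varepsilon_\tau (\sigma\tau A) = \varepsilon_\sigma \frac{1}{p!} \sum_{\tau \in S_p} \varepsilon_{\sigma\tau} (\sigma\tau A) = \varepsilon_\sigma \operatorname{Alt} A.$ □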

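Since $\operatorname{Alt}\colon T_p(\mathcal{V}) \to T_p(\mathcal{V})$ the iteration

    $\operatorname{Alt} \circ \operatorname{Alt}\colon T_p(\mathcal{V}) \to T_p(\mathcal{V})$

is well-defined.

Proposition 3. The following equation holds

    $\operatorname{Alt} \circ \operatorname{Alt} = \operatorname{Alt}.$

Proof. By linearity of alternation and Proposition 2

    $\operatorname{Alt}(\operatorname{Alt} A) = \operatorname{Alt}\Bigl(\frac{1}{p!} \sum_{\sigma \in S_p} \varepsilon_\sigma (\sigma A)\Bigr) =$
    $= \frac{1}{p!} \sum_{\sigma \in S_p} \varepsilon_\sigma \operatorname{Alt}(\sigma A) = \frac{1}{p!} \sum_{\sigma \in S_p} (\varepsilon_\sigma)^2 \operatorname{Alt} A = \operatorname{Alt} A.$ □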
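
Propositions 2 and 3 can be tested on the coefficients of a concrete functional. The sketch below makes several assumptions of its own: indices and permutations are 0-based, a functional of degree $p$ is stored as a dictionary from index tuples to exact rationals, and the names `sign`, `act` and `alt` are hypothetical. The action on coefficients follows the rule $(\sigma A)_{i_1 \ldots i_p} = A_{i_{\sigma(1)} \ldots i_{\sigma(p)}}$ stated above.

```python
from fractions import Fraction
from itertools import permutations, product
from math import factorial

def sign(perm):
    """Sign of a 0-based permutation given as a tuple of images."""
    inversions = sum(1 for i in range(len(perm))
                     for j in range(i + 1, len(perm)) if perm[i] > perm[j])
    return -1 if inversions % 2 else 1

def act(perm, A):
    """Coefficients of sigma A: (sigma A)_{i_1..i_p} = A_{i_sigma(1)..i_sigma(p)}."""
    return {idx: A[tuple(idx[s] for s in perm)] for idx in A}

def alt(A, p):
    """Coefficients of Alt A = (1/p!) * sum over sigma of sign(sigma) * (sigma A)."""
    out = {idx: Fraction(0) for idx in A}
    for perm in permutations(range(p)):
        sA = act(perm, A)
        for idx in A:
            out[idx] += sign(perm) * sA[idx]
    return {idx: v / factorial(p) for idx, v in out.items()}

# An arbitrary (illustrative) functional of degree 3 in a 3-dimensional space.
n, p = 3, 3
A = {idx: Fraction((idx[0] + 1) * (idx[1] + 1) ** 2 * (idx[2] + 1) ** 3)
     for idx in product(range(n), repeat=p)}

alt_A = alt(A, p)
idempotent = alt(alt_A, p) == alt_A
equivariant = all(
    alt(act(perm, A), p) == {idx: sign(perm) * v for idx, v in alt_A.items()}
    for perm in permutations(range(p)))
```

Both flags come out true for this example, which is of course only a spot check and not a proof.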

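How can the coefficients $(\operatorname{Alt} A)_{i_1 \ldots i_p}$ of the functional
$\operatorname{Alt} A$ be expressed in terms of the coefficients $A_{i_1 \ldots i_p}$ of
the functional $A$? It is appropriate to introduce a compact
notation for these formulas that may be useful for other
purposes.

Let $A_{i_1 \ldots i_p}$ be $n^p$ given numbers with indices $i_1, \ldots, i_p$
varying from 1 to $n$. Associate with them other $n^p$ numbers
similarly indexed and defined by the formula

    $B_{i_1 \ldots i_p} = \frac{1}{p!} \sum_{\sigma \in S_p} \varepsilon_\sigma A_{i_{\sigma(1)} \ldots i_{\sigma(p)}}.$

Allowing for a certain degree of inaccuracy in the formulas,
the numbers $B_{i_1 \ldots i_p}$ are usually denoted by $A_{[i_1 \ldots i_p]}$. Thus by
definition

    $A_{[i_1 \ldots i_p]} = \frac{1}{p!} \sum_{\sigma \in S_p} \varepsilon_\sigma A_{i_{\sigma(1)} \ldots i_{\sigma(p)}}.$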






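Of course the position of indices does not play any role in
this notation. If superscripts are used in denoting the given
numbers: $A^{i_1 \ldots i_p}$, then one accordingly sets

    $A^{[i_1 \ldots i_p]} = \frac{1}{p!} \sum_{\sigma \in S_p} \varepsilon_\sigma A^{i_{\sigma(1)} \ldots i_{\sigma(p)}}.$

Proposition 4. For the coefficients of a functional $\operatorname{Alt} A$
the following formula holds

    $(\operatorname{Alt} A)_{i_1 \ldots i_p} = A_{[i_1 \ldots i_p]}.$

Proof. By definition

    $(\operatorname{Alt} A)_{i_1 \ldots i_p} = \operatorname{Alt} A(e_{i_1}, \ldots, e_{i_p}) =$
    $= \frac{1}{p!} \sum_{\sigma \in S_p} \varepsilon_\sigma (\sigma A)(e_{i_1}, \ldots, e_{i_p}) =$
    $= \frac{1}{p!} \sum_{\sigma \in S_p} \varepsilon_\sigma A(e_{i_{\sigma(1)}}, \ldots, e_{i_{\sigma(p)}}) =$
    $= \frac{1}{p!} \sum_{\sigma \in S_p} \varepsilon_\sigma A_{i_{\sigma(1)} \ldots i_{\sigma(p)}}.$ □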

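The significance of the notation introduced above is not
exhausted by this formula. For example, it is convenient
to use it in writing determinants.

Lemma 1. The following identity holds

    $\begin{vmatrix} x_1^1 & \ldots & x_1^p \\ x_2^1 & \ldots & x_2^p \\ \vdots & & \vdots \\ x_p^1 & \ldots & x_p^p \end{vmatrix} = p!\, x_1^{[1} \cdots x_p^{p]} = p!\, x_{[1}^{1} \cdots x_{p]}^{p}.$

Proof. By definition

    $p!\, x_1^{[1} \cdots x_p^{p]} = \sum_{\sigma \in S_p} \varepsilon_\sigma x_1^{\sigma(1)} \cdots x_p^{\sigma(p)}$

and

    $p!\, x_{[1}^{1} \cdots x_{p]}^{p} = \sum_{\sigma \in S_p} \varepsilon_\sigma x_{\sigma(1)}^{1} \cdots x_{\sigma(p)}^{p}.$

In both cases the expression on the right is equal to the
determinant $|x_j^i|$ (first expanded “by the rows” and then
“by the columns”). □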
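
The two signed sums appearing in the proof can be evaluated directly. The sketch below is illustrative: the matrix is our own example, stored as `x[k][i]` for $x_{k+1}^{i+1}$ with 0-based positions, and the names `det_by_rows` and `det_by_columns` are hypothetical.

```python
from itertools import permutations

def sign(perm):
    """Sign of a 0-based permutation given as a tuple of images."""
    inversions = sum(1 for i in range(len(perm))
                     for j in range(i + 1, len(perm)) if perm[i] > perm[j])
    return -1 if inversions % 2 else 1

def det_by_rows(x):
    """Sum over sigma of sign(sigma) * x_1^{sigma(1)} ... x_p^{sigma(p)}."""
    p = len(x)
    total = 0
    for perm in permutations(range(p)):
        term = sign(perm)
        for k in range(p):
            term *= x[k][perm[k]]      # upper indices are permuted
        total += term
    return total

def det_by_columns(x):
    """Sum over sigma of sign(sigma) * x_{sigma(1)}^1 ... x_{sigma(p)}^p."""
    p = len(x)
    total = 0
    for perm in permutations(range(p)):
        term = sign(perm)
        for k in range(p):
            term *= x[perm[k]][k]      # lower indices are permuted
        total += term
    return total

# Illustrative matrix (not taken from the text).
x = [[2, 0, 1],
     [1, 3, -1],
     [0, 5, 4]]
```

For this matrix both sums give the same number, the determinant 39.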

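Proposition 5. For any vectors $x_1, \ldots, x_p$ the following
formulas hold

    $(\operatorname{Alt} A)(x_1, \ldots, x_p) = A_{[i_1 \ldots i_p]}\, x_1^{i_1} \cdots x_p^{i_p} =$
    $= A_{i_1 \ldots i_p}\, x_{[1}^{i_1} \cdots x_{p]}^{i_p} =$
    $= A_{i_1 \ldots i_p}\, x_1^{[i_1} \cdots x_p^{i_p]}.$

Proof. The first formula is but a different way of writing
the statement of Proposition 4. The second is proved by
computation:

    $(\operatorname{Alt} A)(x_1, \ldots, x_p) = \frac{1}{p!} \sum_{\sigma \in S_p} \varepsilon_\sigma A(x_{\sigma(1)}, \ldots, x_{\sigma(p)}) =$
    $= \frac{1}{p!} \sum_{\sigma \in S_p} \varepsilon_\sigma A_{i_1 \ldots i_p}\, x_{\sigma(1)}^{i_1} \cdots x_{\sigma(p)}^{i_p} =$
    $= A_{i_1 \ldots i_p} \frac{1}{p!} \sum_{\sigma \in S_p} \varepsilon_\sigma x_{\sigma(1)}^{i_1} \cdots x_{\sigma(p)}^{i_p} =$
    $= A_{i_1 \ldots i_p}\, x_{[1}^{i_1} \cdots x_{p]}^{i_p}.$

The third formula follows from the second by virtue of
Lemma 1. □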

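Corollary. The following formula holds

    $(\operatorname{Alt} A)(x_1, \ldots, x_p) = \frac{1}{p!} A_{i_1 \ldots i_p} \begin{vmatrix} x_1^{i_1} & \ldots & x_1^{i_p} \\ \vdots & & \vdots \\ x_p^{i_1} & \ldots & x_p^{i_p} \end{vmatrix},$

where as always summation over $i_1, \ldots, i_p$ is carried out from
1 to $n$.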







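Example. For $p = 2$

    $(\operatorname{Alt} A)(x, y) = \frac{A_{ij} - A_{ji}}{2}\, x^i y^j =$
    $= A_{ij}\, \frac{x^i y^j - x^j y^i}{2} =$
    $= \frac{1}{2} A_{ij} \begin{vmatrix} x^i & x^j \\ y^i & y^j \end{vmatrix}.$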

Lecture 8 


Skew-symmetric multilinear functionals • External multiplication •
Grassmann algebra • External sums of covectors •
Expansion of skew-symmetric functionals with respect to the
external products of covectors of a basis


Suppose we are given two sets of $n^p$ numbers $x_{i_1 \ldots i_p}$ and
$y^{i_1 \ldots i_p}$ with $p$ indices $i_1, \ldots, i_p$ independently running from
1 to $n$. On multiplying each of the numbers $x_{i_1 \ldots i_p}$ by a
corresponding number $y^{i_1 \ldots i_p}$ and adding all the products
together we obtain a number

    $x_{i_1 \ldots i_p}\, y^{i_1 \ldots i_p}.$

Lemma 1. For any permutation $\sigma \in S_p$ we have the identity

(1)     $x_{i_1 \ldots i_p}\, y^{i_1 \ldots i_p} = x_{i_{\sigma(1)} \ldots i_{\sigma(p)}}\, y^{i_{\sigma(1)} \ldots i_{\sigma(p)}}.$

Proof. Both sides of relation (1) are sums of the same
but differently ordered terms. □

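Suppose we are given $n^2$ numbers $x_j^i$, where $i, j = 1, \ldots, n$.
Consider all possible products of the form

    $x_{j_1}^{i_1} \cdots x_{j_p}^{i_p}.$

Lemma 2. For any permutation $\sigma \in S_p$ we have the identity

(2)     $x_1^{i_{\sigma(1)}} \cdots x_p^{i_{\sigma(p)}} = x_{\tau(1)}^{i_1} \cdots x_{\tau(p)}^{i_p},$

where $\tau = \sigma^{-1}$.

Proof. Both sides of relation (2) are products of the same
but differently ordered multipliers. For example, if $p = 4$
and

    $\sigma = \begin{pmatrix} 1 & 2 & 3 & 4 \\ 2 & 4 & 3 & 1 \end{pmatrix}, \quad \tau = \begin{pmatrix} 1 & 2 & 3 & 4 \\ 4 & 1 & 3 & 2 \end{pmatrix},$

then

    $x_1^{i_2} x_2^{i_4} x_3^{i_3} x_4^{i_1} = x_4^{i_1} x_1^{i_2} x_3^{i_3} x_2^{i_4}.$ □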


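Definition 1. A multilinear functional $A$ is said to be
skew-symmetric if

    $\sigma A = \varepsilon_\sigma A$

for any permutation $\sigma \in S_p$.

Proposition 1. A functional $A$ is skew-symmetric if and
only if

(3)     $A_{i_{\sigma(1)} \ldots i_{\sigma(p)}} = \varepsilon_\sigma A_{i_1 \ldots i_p}$

for any indices $i_1, \ldots, i_p$ and any permutation $\sigma \in S_p$.

Proof. If $A$ is skew-symmetric, then

    $A_{i_{\sigma(1)} \ldots i_{\sigma(p)}} = (\sigma A)_{i_1 \ldots i_p} = \varepsilon_\sigma A_{i_1 \ldots i_p}.$

Conversely, if relation (3) holds, then for any vectors
$x_1, \ldots, x_p$ we have

    $(\sigma A)(x_1, \ldots, x_p) = A(x_{\sigma(1)}, \ldots, x_{\sigma(p)}) = A_{i_1 \ldots i_p}\, x_{\sigma(1)}^{i_1} \cdots x_{\sigma(p)}^{i_p}.$

But according to Lemmas 1 and 2 and condition (3), with $\tau = \sigma^{-1}$,

    $A_{i_1 \ldots i_p}\, x_{\sigma(1)}^{i_1} \cdots x_{\sigma(p)}^{i_p} = A_{i_1 \ldots i_p}\, x_1^{i_{\tau(1)}} \cdots x_p^{i_{\tau(p)}} =$
    $= A_{i_{\sigma(1)} \ldots i_{\sigma(p)}}\, x_1^{i_1} \cdots x_p^{i_p} =$
    $= \varepsilon_\sigma A_{i_1 \ldots i_p}\, x_1^{i_1} \cdots x_p^{i_p} =$
    $= \varepsilon_\sigma A(x_1, \ldots, x_p).$

Hence $\sigma A = \varepsilon_\sigma A$. □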




Proposition 2. A multilinear functional $A$ is skew-symmetric
if and only if it remains unchanged when alternated:

    $\operatorname{Alt} A = A.$

Proof. If

    $\sigma A = \varepsilon_\sigma A$

for any permutation $\sigma \in S_p$, then the terms of the sum

    $\sum_{\sigma \in S_p} \varepsilon_\sigma (\sigma A)$

are all equal to $A$, and therefore this sum is equal to $p!\,A$.
Hence

    $\operatorname{Alt} A = A.$

Conversely, if $\operatorname{Alt} A = A$, then according to Proposition 2
of the preceding lecture

    $\sigma A = \sigma \operatorname{Alt} A = \varepsilon_\sigma \operatorname{Alt} A = \varepsilon_\sigma A.$ □

Corollary. A multilinear functional $A$ is skew-symmetric
if and only if for its coefficients the following equations hold

    $A_{i_1 \ldots i_p} = A_{[i_1 \ldots i_p]}.$

A formally somewhat more general condition of the skew-symmetry
of a functional is given by the following proposition:

Proposition 3. A multilinear functional $A$ is skew-symmetric
if and only if there exists a multilinear functional $B$ such that

(4)     $A = \operatorname{Alt} B.$

Proof. If $A$ is skew-symmetric, then (4) holds for $B = A$
(Proposition 2). Conversely, if (4) holds, then according to
Proposition 3 of the preceding lecture

    $\operatorname{Alt} A = \operatorname{Alt}(\operatorname{Alt} B) = \operatorname{Alt} B = A$

and hence (Proposition 2) $A$ is skew-symmetric. □

A tensor product $A \otimes B$ of two skew-symmetric functionals
will not in general be a skew-symmetric functional. To
turn this product into a skew-symmetric functional it is
necessary to alternate it.







Definition 2. An external product $A \wedge B$ of skew-symmetric
functionals $A$ and $B$ is the functional

    $A \wedge B = \operatorname{Alt}(A \otimes B).$

Its degree is equal to $p + q$, where $p$ and $q$ are the degrees
of $A$ and $B$, and its coefficients are expressed by the formula

    $(A \wedge B)_{i_1 \ldots i_{p+q}} = A_{[i_1 \ldots i_p} B_{i_{p+1} \ldots i_{p+q}]}.$

Proposition 4. External multiplication of skew-symmetric
functionals is associative, i.e.

    $(A \wedge B) \wedge C = A \wedge (B \wedge C)$

for any three skew-symmetric functionals $A$, $B$ and $C$.

By virtue of this proposition one may omit brackets in
the external products of several functionals.

We shall preface the proof of Proposition 4 with some
remarks that are of interest in themselves.

For any $p$ and $q$ we can map a symmetric group $S_p$ into
a symmetric group $S_{p+q}$ by associating with an arbitrary
permutation $\sigma \in S_p$ a permutation $\sigma' \in S_{p+q}$ acting on the
numbers $1, \ldots, p$ in the same way as $\sigma$ and leaving the
numbers $p + 1, \ldots, p + q$ fixed:

    $\sigma'(i) = \begin{cases} \sigma(i) & \text{if } 1 \leq i \leq p, \\ i & \text{if } p + 1 \leq i \leq p + q. \end{cases}$

It is clear that the correspondence $\sigma \mapsto \sigma'$ is a monomorphism
(an injective homomorphism) preserving the sign,
i.e. such that

    $\varepsilon_{\sigma'} = \varepsilon_\sigma$

for any permutation $\sigma$.

Applying permutations of the form $\sigma'$, $\sigma \in S_p$, it is possible
to have an arbitrary multilinear functional $A$ of degree
$p + q$ “alternated only by the first $p$ arguments”, i.e. to construct
a functional

    $\operatorname{alt} A = \frac{1}{p!} \sum_{\sigma \in S_p} \varepsilon_\sigma (\sigma' A).$





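Lemma 3. $\operatorname{Alt}(\operatorname{alt} A) = \operatorname{Alt} A$.

The proof of this lemma actually completely repeats that
of Proposition 3 of the preceding lecture:

    $\operatorname{Alt}(\operatorname{alt} A) = \operatorname{Alt}\Bigl(\frac{1}{p!} \sum_{\sigma \in S_p} \varepsilon_\sigma (\sigma' A)\Bigr) =$
    $= \frac{1}{p!} \sum_{\sigma \in S_p} \varepsilon_\sigma \varepsilon_{\sigma'} \operatorname{Alt} A = \operatorname{Alt} A.$ □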

We can now pass directly on to the proof of Proposition 4.
Proof of Proposition 4. Let p, q, and r be the degrees of the
functionals A, B and C.

By definition

$$(A \wedge B) \wedge C = \operatorname{Alt}((A \wedge B) \otimes C).$$

But it is clear that

$$(A \wedge B) \otimes C = \operatorname{alt}(A \otimes B \otimes C),$$

where alt designates alternation by the first p + q indices.
Therefore according to Lemma 3

$$(A \wedge B) \wedge C = \operatorname{Alt}(A \otimes B \otimes C).$$

We can similarly prove that

$$A \wedge (B \wedge C) = \operatorname{Alt}(A \otimes B \otimes C).$$

Consequently $(A \wedge B) \wedge C = A \wedge (B \wedge C)$. □

We see in particular that

$$A \wedge B \wedge C = \operatorname{Alt}(A \otimes B \otimes C).$$

It is clear that a similar formula holds for any number of 
multipliers. 

Unlike tensor multiplication, external multiplication is
commutative, though only up to a sign.

Proposition 5. For any two skew-symmetric multilinear
functionals A and B of degrees p and q the following equation
holds

$$B \wedge A = (-1)^{pq} A \wedge B.$$

This property of external multiplication is called skew-commutativity.


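Proof. By definition

$$\begin{aligned}
(B \otimes A)(x_1, \dots, x_{p+q}) &= B(x_1, \dots, x_q)\, A(x_{q+1}, \dots, x_{p+q}) \\
&= A(x_{q+1}, \dots, x_{p+q})\, B(x_1, \dots, x_q) \\
&= (A \otimes B)(x_{q+1}, \dots, x_{p+q}, x_1, \dots, x_q) \\
&= (\sigma_0 (A \otimes B))(x_1, \dots, x_{p+q}),
\end{aligned}$$

where

$$\sigma_0 = \begin{pmatrix} 1, & \dots, & p, & p+1, & \dots, & p+q \\ q+1, & \dots, & q+p, & 1, & \dots, & q \end{pmatrix},$$

i.e.

$$B \otimes A = \sigma_0 (A \otimes B).$$

Therefore

$$B \wedge A = \operatorname{Alt}(B \otimes A) = \varepsilon_{\sigma_0} \operatorname{Alt}(A \otimes B) = \varepsilon_{\sigma_0} (A \wedge B).$$

To complete the proof it remains to note that

$$\varepsilon_{\sigma_0} = (-1)^{pq}. \quad \square$$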
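Propositions 4 and 5 can be checked numerically on a small example. The following is only a sketch under stated assumptions: functionals are stored as dictionaries of coefficients over a 4-dimensional space, and the names `covector`, `tensor`, `alt`, `wedge` and `scale` are illustrative helpers introduced here, not notation from the text.

```python
from fractions import Fraction
from itertools import permutations, product
from math import factorial

N = 4  # dimension of the space; chosen only for this illustration


def sign(perm):
    """Sign of a permutation given as a tuple of 0-based images."""
    s = 1
    for i in range(len(perm)):
        for j in range(i + 1, len(perm)):
            if perm[i] > perm[j]:
                s = -s
    return s


def covector(coords):
    """A functional of degree 1 stored as {(i,): coefficient}."""
    return {(i,): Fraction(c) for i, c in enumerate(coords)}


def tensor(a, p, b, q):
    """Coefficients of the tensor product of functionals of degrees p and q."""
    return {i + j: a[i] * b[j]
            for i in product(range(N), repeat=p)
            for j in product(range(N), repeat=q)}


def alt(t, p):
    """Alt T = (1/p!) * sum over all permutations of eps_sigma * (sigma T)."""
    out = {}
    for idx in product(range(N), repeat=p):
        total = Fraction(0)
        for perm in permutations(range(p)):
            total += sign(perm) * t[tuple(idx[k] for k in perm)]
        out[idx] = total / factorial(p)
    return out


def wedge(a, p, b, q):
    """External product A ^ B = Alt(A (x) B)."""
    return alt(tensor(a, p, b, q), p + q)


def scale(t, c):
    """Multiply every coefficient of a functional by the number c."""
    return {idx: c * v for idx, v in t.items()}


A = covector([1, 2, 0, -1])
C = covector([0, 1, 3, 2])
B = wedge(covector([0, 1, 0, 0]), 1, covector([0, 0, 1, 0]), 1)  # degree 2

left = wedge(wedge(A, 1, B, 2), 3, C, 1)
right = wedge(A, 1, wedge(B, 2, C, 1), 3)
print(left == right)                                      # associativity
print(wedge(B, 2, A, 1) == wedge(A, 1, B, 2))             # sign (-1)^(2*1) = +1
print(wedge(C, 1, A, 1) == scale(wedge(A, 1, C, 1), -1))  # sign (-1)^(1*1) = -1
```

Exact rational arithmetic is used so that the comparisons are equalities of coefficients rather than approximate checks.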

It is clear that the set $\Lambda_p(\mathcal{V})$ of all skew-symmetric
functionals of degree p is a subspace of the space $T_p(\mathcal{V})$ and
hence is itself a vector space. The operation of external
multiplication of skew-symmetric functionals is obviously
distributive over addition:

$$(A + B) \wedge C = A \wedge C + B \wedge C.$$

This means that under the operations $+$ and $\wedge$ the vector spaces

$$\Lambda_0(\mathcal{V}),\ \Lambda_1(\mathcal{V}),\ \dots,\ \Lambda_p(\mathcal{V}),\ \dots$$

constitute an algebraic object that is an example of what
is called a graded algebra. This is designated by the symbol
$\Lambda(\mathcal{V})$ and called the exterior algebra of the space $\mathcal{V}$ (or its
Grassmann algebra).

Note that for p = 1 the skew-symmetry condition imposes
no restrictions. Therefore

$$\Lambda_1(\mathcal{V}) = T_1(\mathcal{V}) = \mathcal{V}'.$$

By similar considerations

$$\Lambda_0(\mathcal{V}) = T_0(\mathcal{V}) = K.$$




It follows from skew-symmetry in an obvious way that
$A(x_1, \dots, x_p) = 0$ if at least two of the vectors $x_1, \dots, x_p$ coincide
(recall the corresponding reasoning for determinants).
Hence, by multilinearity, $A(x_1, \dots, x_p) = 0$ if one of the
vectors $x_1, \dots, x_p$ is linearly expressible in terms of the others.
Since for p > n this is always the case, we thus get

$$\Lambda_p(\mathcal{V}) = 0 \quad \text{for } p > n.$$

Of particular interest are external products of first-degree 
functionals, i.e. of covectors. 


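According to the remark made above

$$\xi^1 \wedge \dots \wedge \xi^p = \operatorname{Alt}(\xi^1 \otimes \dots \otimes \xi^p)$$

for any covectors $\xi^1, \dots, \xi^p$. This means that for any vectors
$x_1, \dots, x_p$ we have

$$\begin{aligned}
(\xi^1 \wedge \dots \wedge \xi^p)(x_1, \dots, x_p) &= \frac{1}{p!} \sum_{\sigma \in S_p} \varepsilon_\sigma\, (\xi^1 \otimes \dots \otimes \xi^p)(x_{\sigma(1)}, \dots, x_{\sigma(p)}) \\
&= \frac{1}{p!} \sum_{\sigma \in S_p} \varepsilon_\sigma\, \xi^1(x_{\sigma(1)}) \cdots \xi^p(x_{\sigma(p)}),
\end{aligned}$$

i.e.

$$(\xi^1 \wedge \dots \wedge \xi^p)(x_1, \dots, x_p) = \frac{1}{p!} \begin{vmatrix} \xi^1(x_1) & \dots & \xi^1(x_p) \\ \vdots & & \vdots \\ \xi^p(x_1) & \dots & \xi^p(x_p) \end{vmatrix}. \tag{5}$$

This very important identity can be rewritten (see
Lemma 1 of the preceding lecture) in the following equivalent
form

$$(\xi^1 \wedge \dots \wedge \xi^p)(x_1, \dots, x_p) = \xi^1(x_{[1}) \cdots \xi^p(x_{p]}),$$

or in the form

$$(\xi^1 \wedge \dots \wedge \xi^p)(x_1, \dots, x_p) = \xi^{[1}(x_1) \cdots \xi^{p]}(x_p). \tag{6}$$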

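We now introduce the functional

$$\xi^{[1} \otimes \dots \otimes \xi^{p]} = \frac{1}{p!} \sum_{\sigma \in S_p} \varepsilon_\sigma\, \xi^{\sigma(1)} \otimes \dots \otimes \xi^{\sigma(p)}.$$

Its value on the vectors $x_1, \dots, x_p$ is expressed by the formula

$$\begin{aligned}
(\xi^{[1} \otimes \dots \otimes \xi^{p]})(x_1, \dots, x_p) &= \frac{1}{p!} \sum_{\sigma \in S_p} \varepsilon_\sigma\, (\xi^{\sigma(1)} \otimes \dots \otimes \xi^{\sigma(p)})(x_1, \dots, x_p) \\
&= \frac{1}{p!} \sum_{\sigma \in S_p} \varepsilon_\sigma\, \xi^{\sigma(1)}(x_1) \cdots \xi^{\sigma(p)}(x_p) = \xi^{[1}(x_1) \cdots \xi^{p]}(x_p).
\end{aligned} \tag{7}$$

Comparing formulas (6) and (7) we obtain the following
proposition:

Proposition 6. For any covectors $\xi^1, \dots, \xi^p$ we have

$$\xi^1 \wedge \dots \wedge \xi^p = \xi^{[1} \otimes \dots \otimes \xi^{p]}.$$

Corollary. A functional $\xi^1 \wedge \dots \wedge \xi^p$ is in tensor form
expressible in terms of the covectors $\xi^1, \dots, \xi^p$. □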

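We now prove a simple but important proposition.
Proposition 7. The equation

$$\xi^1 \wedge \dots \wedge \xi^p = 0$$

holds if and only if the covectors $\xi^1, \dots, \xi^p$ are linearly dependent.

Proof. By skew-commutativity of external multiplication
the product $\xi^1 \wedge \dots \wedge \xi^p$ changes sign when any two
multipliers are interchanged. By a now familiar reasoning
it can be deduced from this that $\xi^1 \wedge \dots \wedge \xi^p = 0$ if
the covectors $\xi^1, \dots, \xi^p$ are linearly dependent.

Let the covectors $\xi^1, \dots, \xi^p$ be linearly independent. Then
they can be supplemented to obtain a basis

$$\xi^1, \dots, \xi^p, e^{p+1}, \dots, e^n$$

of the whole space $\mathcal{V}'$. Let

$$e_1, \dots, e_n$$

be a conjugate basis of the space $\mathcal{V}$. Then according to
formula (5)

$$(\xi^1 \wedge \dots \wedge \xi^p)(e_1, \dots, e_p) = \frac{1}{p!} \begin{vmatrix} \xi^1(e_1) & \dots & \xi^1(e_p) \\ \vdots & & \vdots \\ \xi^p(e_1) & \dots & \xi^p(e_p) \end{vmatrix} = \frac{1}{p!} \begin{vmatrix} 1 & & 0 \\ & \ddots & \\ 0 & & 1 \end{vmatrix} = \frac{1}{p!}.$$

Therefore $\xi^1 \wedge \dots \wedge \xi^p \ne 0$. □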

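Let $e_1, \dots, e_n$ be an arbitrary basis of a space $\mathcal{V}$. Then
every multilinear functional A allows, as we know, a representation
of the form

$$A = A_{i_1 \dots i_p}\, e^{i_1} \otimes \dots \otimes e^{i_p}.$$

If the functional is skew-symmetric (and hence A = Alt A),
then after alternating we obtain from this a
formula of the form

$$A = A_{i_1 \dots i_p}\, e^{i_1} \wedge \dots \wedge e^{i_p}. \tag{8}$$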

There are, however, many zero and identical terms in this
formula. We should therefore "reduce similar terms" in it.

According to Proposition 7 the terms in the sum (8) for
which there are identical indices among the indices $i_1, \dots, i_p$
are all equal to zero. Therefore

$$A = \sum A_{i_1 \dots i_p}\, e^{i_1} \wedge \dots \wedge e^{i_p}, \tag{9}$$

where summation is taken over all p-member sets $i_1, \dots, i_p$
of integers 1 to n consisting of different numbers.

On fixing one of such sets consider in the sum (9) the
terms differing only in the order of their indices. There are p!
such terms in all and each has the form

$$A_{i_{\sigma(1)} \dots i_{\sigma(p)}}\, e^{i_{\sigma(1)}} \wedge \dots \wedge e^{i_{\sigma(p)}} \quad \text{(without summation)}, \tag{10}$$

where $\sigma$ is an arbitrary permutation of degree p. But, as is
immediate from the skew-commutativity of external multiplication,

$$e^{i_{\sigma(1)}} \wedge \dots \wedge e^{i_{\sigma(p)}} = \varepsilon_\sigma\, e^{i_1} \wedge \dots \wedge e^{i_p}.$$

On the other hand, according to Proposition 1,

$$A_{i_{\sigma(1)} \dots i_{\sigma(p)}} = \varepsilon_\sigma A_{i_1 \dots i_p}.$$

Since $\varepsilon_\sigma \varepsilon_\sigma = 1$, all of the terms (10) are equal to

$$A_{i_1 \dots i_p}\, e^{i_1} \wedge \dots \wedge e^{i_p} \quad \text{(without summation)}.$$

This proves that

$$A = p! \sum_{(i_1, \dots, i_p)} A_{i_1 \dots i_p}\, e^{i_1} \wedge \dots \wedge e^{i_p},$$

where summation is taken over all combinations $(i_1, \dots, i_p)$
of indices in the sum (9). Since for every combination there
exists a unique set $i_1, \dots, i_p$ for which $i_1 < i_2 < \dots < i_p$,
this proves the following proposition:

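Proposition 8. For any skew-symmetric functional A the
following equation holds

$$A = p! \sum_{i_1 < \dots < i_p} A_{i_1 \dots i_p}\, e^{i_1} \wedge \dots \wedge e^{i_p}. \quad \square$$

Thus functionals of the form

$$e^{i_1} \wedge \dots \wedge e^{i_p}, \qquad 1 \le i_1 < \dots < i_p \le n,$$

constitute a family complete in $\Lambda_p(\mathcal{V})$.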




Lecture 9 


The basis of a space of skew-symmetric functionals • Formulas
for the transformation of the basis of that space • Multivectors •
The external rank of a skew-symmetric functional •
Multivector rank theorem • Conditions for the equality of
multivectors


It follows from Proposition 1 of the preceding lecture that
for the coefficients $A_{i_1 \dots i_p}$ of a skew-symmetric functional A
we have the equations

$$A_{i_1 \dots i_p} = \begin{cases} 0 & \text{if there are identical numbers among the numbers } i_1, \dots, i_p, \\ \varepsilon_\sigma A_{i_{\sigma(1)} \dots i_{\sigma(p)}} & \text{otherwise,} \end{cases} \tag{1}$$

where $\sigma$ is a permutation of degree p such that

$$i_{\sigma(1)} < \dots < i_{\sigma(p)}.$$

It follows that in order to completely reconstruct the functional
A it is sufficient to know only those of its coefficients
for which $i_1 < \dots < i_p$.

Definition 1. The coefficients $A_{i_1 \dots i_p}$ for which $i_1 < \dots < i_p$
are called the essential coefficients of a skew-symmetric
functional A.

Proposition 1. For any numbers

$$A_{i_1 \dots i_p} \tag{2}$$

with indices $i_1 < \dots < i_p$ there exists a unique skew-symmetric
functional A the essential coefficients of which are
these numbers.



Proof. The uniqueness of the functional A has just been
established. We should therefore prove only its existence.
On determining numbers $A_{i_1 \dots i_p}$ for all $i_1, \dots, i_p$ by
means of formulas (1) consider a multilinear functional

$$A = A_{i_1 \dots i_p}\, e^{i_1} \otimes \dots \otimes e^{i_p}. \tag{3}$$

It is clear that if the functional is skew-symmetric, then its
essential coefficients are precisely the numbers (2). Everything
will thus be proved if we show that the functional (3)
is skew-symmetric.

To do this it suffices, according to Proposition 1, to prove
that for the coefficients of the functional (3) we have the
relations

$$A_{i_{\sigma(1)} \dots i_{\sigma(p)}} = \varepsilon_\sigma A_{i_1 \dots i_p}, \tag{4}$$

where $\sigma$ is an arbitrary permutation of degree p. And we can
obviously assume without loss of generality that the indices
$i_1, \dots, i_p$ are all different (since otherwise both sides of
formula (4) are equal to zero).

But if the indices $i_1, \dots, i_p$ are different, then by definition

$$A_{i_1 \dots i_p} = \varepsilon_\tau A_{i_{\tau(1)} \dots i_{\tau(p)}},$$

where $\tau$ is a permutation such that $i_{\tau(1)} < \dots < i_{\tau(p)}$.
Similarly

$$A_{i_{\sigma(1)} \dots i_{\sigma(p)}} = \varepsilon_\rho A_{i_{\rho(\sigma(1))} \dots i_{\rho(\sigma(p))}},$$

where $\rho$ is a permutation such that $i_{\rho(\sigma(1))} < \dots < i_{\rho(\sigma(p))}$.
But the numbers $i_{\tau(1)}, \dots, i_{\tau(p)}$ and $i_{\rho(\sigma(1))}, \dots, i_{\rho(\sigma(p))}$
are the same, since both the former and the latter are the
indices $i_1, \dots, i_p$ arranged in increasing order. Consequently

$$\tau(1) = \rho(\sigma(1)), \ \dots, \ \tau(p) = \rho(\sigma(p)),$$

i.e. $\tau = \rho\sigma$. Therefore $\varepsilon_\tau = \varepsilon_\rho \varepsilon_\sigma$ and hence

$$A_{i_1 \dots i_p} = \varepsilon_\sigma A_{i_{\sigma(1)} \dots i_{\sigma(p)}}.$$

Theorem 1. The external products

$$e^{i_1} \wedge \dots \wedge e^{i_p}, \qquad 1 \le i_1 < \dots < i_p \le n, \tag{5}$$

constitute a basis of the space $\Lambda_p(\mathcal{V})$.





Proof. In view of Proposition 8 of the preceding lecture
it is sufficient to prove that the functionals (5) are linearly
independent.

Let

$$\sum_{i_1 < \dots < i_p} A_{i_1 \dots i_p}\, e^{i_1} \wedge \dots \wedge e^{i_p} = 0,$$

where $A_{i_1 \dots i_p}$, $i_1 < \dots < i_p$, are some numbers. According
to Proposition 1 there exists a skew-symmetric functional A
the essential coefficients of which are the numbers $A_{i_1 \dots i_p}$.
According to Proposition 8 of the preceding lecture that
functional can be expressed by the formula

$$A = p! \sum_{i_1 < \dots < i_p} A_{i_1 \dots i_p}\, e^{i_1} \wedge \dots \wedge e^{i_p}$$

and hence is under the hypothesis equal to zero. But then

$$A_{i_1 \dots i_p} = A(e_{i_1}, \dots, e_{i_p}) = 0.$$

Therefore the functionals (5) are linearly independent. □
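Corollary 1. The representation of a skew-symmetric functional
A as

$$A = p! \sum_{i_1 < \dots < i_p} A_{i_1 \dots i_p}\, e^{i_1} \wedge \dots \wedge e^{i_p}$$

is unique. □

Corollary 2. The dimension of the space $\Lambda_p(\mathcal{V})$ is equal to

$$\dim \Lambda_p(\mathcal{V}) = \binom{n}{p}. \quad \square \tag{6}$$

In particular we again see that

$$\Lambda_p(\mathcal{V}) = 0 \quad \text{for } p > n.$$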
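The role of the essential coefficients and the dimension count of Corollary 2 can be made concrete as follows. This is a minimal sketch: indices run from 1 to n as in the text, and `basis_index_sets` and `full_coefficient` are hypothetical helper names. The second function applies formulas (1) to recover an arbitrary coefficient from the essential ones.

```python
from itertools import combinations
from math import comb


def basis_index_sets(n, p):
    """Index sets i_1 < ... < i_p labelling the basis of Theorem 1."""
    return list(combinations(range(1, n + 1), p))


def full_coefficient(essential, idx):
    """Coefficient A_{i_1...i_p} recovered from the essential ones by formulas (1)."""
    if len(set(idx)) < len(idx):
        return 0  # identical indices give a zero coefficient
    # The sign of the sorting permutation is (-1)^(number of inversions).
    inversions = sum(1 for a in range(len(idx))
                     for b in range(a + 1, len(idx)) if idx[a] > idx[b])
    return (-1) ** inversions * essential[tuple(sorted(idx))]


essential = {(1, 2): 5, (1, 3): -2, (2, 3): 7}   # n = 3, p = 2
print(basis_index_sets(3, 2))                     # [(1, 2), (1, 3), (2, 3)]
print(len(basis_index_sets(4, 2)) == comb(4, 2))  # dimension formula (6)
print(full_coefficient(essential, (2, 1)))        # -5
```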

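Let us transform from the basis $e^1, \dots, e^n$ to another
basis $e^{1'}, \dots, e^{n'}$. If as always

$$e^{i'} = c^{i'}_i e^i,$$

then

$$e^{i'_1} \otimes \dots \otimes e^{i'_p} = c^{i'_1}_{i_1} \cdots c^{i'_p}_{i_p}\, e^{i_1} \otimes \dots \otimes e^{i_p}.$$

Therefore (see Proposition 4 of Lecture 7)

$$e^{i'_1} \wedge \dots \wedge e^{i'_p} = \operatorname{Alt}(e^{i'_1} \otimes \dots \otimes e^{i'_p}) = c^{[i'_1}_{i_1} \cdots c^{i'_p]}_{i_p}\, e^{i_1} \otimes \dots \otimes e^{i_p},$$

and hence

$$e^{i'_1} \wedge \dots \wedge e^{i'_p} = p! \sum_{i_1 < \dots < i_p} c^{[i'_1}_{i_1} \cdots c^{i'_p]}_{i_p}\, e^{i_1} \wedge \dots \wedge e^{i_p}.$$

The number $p!\, c^{[i'_1}_{i_1} \cdots c^{i'_p]}_{i_p}$ is equal (see Lemma 1 of the
preceding lecture) to the minor

$$c^{i'_1 \dots i'_p}_{i_1 \dots i_p} = \begin{vmatrix} c^{i'_1}_{i_1} & \dots & c^{i'_1}_{i_p} \\ \vdots & & \vdots \\ c^{i'_p}_{i_1} & \dots & c^{i'_p}_{i_p} \end{vmatrix}$$

of the transition matrix $C = (c^{i'}_i)$ which is in the intersection
of the columns with the numbers $i_1 < \dots < i_p$
and the rows with the numbers $i'_1 < \dots < i'_p$. We can
therefore write the obtained formula for the transformation
of the bases of the space $\Lambda_p(\mathcal{V})$ in the following final form:

$$e^{i'_1} \wedge \dots \wedge e^{i'_p} = \sum_{i_1 < \dots < i_p} c^{i'_1 \dots i'_p}_{i_1 \dots i_p}\, e^{i_1} \wedge \dots \wedge e^{i_p}, \tag{7}$$

where $i'_1 < \dots < i'_p$.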

The results obtained can all be transferred in a natural
way to (0, p)-tensors, i.e. to multilinear functionals
$A\colon \xi^1, \dots, \xi^p \mapsto A(\xi^1, \dots, \xi^p)$ of degree p of covectors. The
only difference is that the subscripts become superscripts
and vice versa. In particular the coefficients of a (0, p)-functional
A have the form $A^{i_1 \dots i_p}$, the basis of the space
$\Lambda^p(\mathcal{V}) = \Lambda_p(\mathcal{V}')$ of all skew-symmetric (0, p)-functionals
consists of external products

$$e_{i_1} \wedge \dots \wedge e_{i_p} = \operatorname{Alt}(e_{i_1} \otimes \dots \otimes e_{i_p}),$$

where $i_1 < \dots < i_p$,

and the expansion of an arbitrary skew-symmetric functional
with respect to this basis is given by the formula

$$A = p! \sum_{i_1 < \dots < i_p} A^{i_1 \dots i_p}\, e_{i_1} \wedge \dots \wedge e_{i_p}.$$




Definition 2. The external products

$$x_1 \wedge \dots \wedge x_p, \qquad x_1, \dots, x_p \in \mathcal{V},$$

of p vectors are called multivectors of degree p or briefly p-vectors.

When p = 0 multivectors are numbers (elements of the
field K) and when p = 1 they are vectors. The set of all
p-vectors of the space $\mathcal{V}$ will be designated by the symbol
Generally speaking, it is not a vector space (since
a sum of two p-vectors may or may not be a p-vector).

For external products $x_1 \wedge \dots \wedge x_p$ the same formulas
hold as for external products $\xi^1 \wedge \dots \wedge \xi^p$ of covectors
(which could by analogy be termed "multicovectors"). In
particular

$$x_1 \wedge \dots \wedge x_p = x_{[1} \otimes \dots \otimes x_{p]} \tag{8}$$

and

$$(x_1 \wedge \dots \wedge x_p)(\xi^1, \dots, \xi^p) = \xi^1(x_{[1}) \cdots \xi^p(x_{p]}) = \frac{1}{p!} \begin{vmatrix} \xi^1(x_1) & \dots & \xi^p(x_1) \\ \vdots & & \vdots \\ \xi^1(x_p) & \dots & \xi^p(x_p) \end{vmatrix} \tag{9}$$

for any covectors $\xi^1, \dots, \xi^p$. And (cf. Proposition 7 of the
preceding lecture) the equation

$$x_1 \wedge \dots \wedge x_p = 0$$

holds if and only if the vectors $x_1, \dots, x_p$ are linearly dependent. □

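It is also useful to note that for any n vectors

$$x_1 = x_1^1 e_1 + \dots + x_1^n e_n, \quad \dots, \quad x_n = x_n^1 e_1 + \dots + x_n^n e_n$$

we have

$$x_1 \wedge \dots \wedge x_n = \begin{vmatrix} x_1^1 & \dots & x_1^n \\ \vdots & & \vdots \\ x_n^1 & \dots & x_n^n \end{vmatrix} (e_1 \wedge \dots \wedge e_n). \tag{10}$$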











Indeed, if the vectors $x_1, \dots, x_n$ are linearly dependent, then
this last equation is obvious (there are zero n-vectors on
both left and right). But if the vectors $x_1, \dots, x_n$ are linearly
independent (and therefore constitute a basis of the space $\mathcal{V}$),
then equation (10) differs only in notation from the "vector"
analogue of formula (7) for the case p = n.

When n = 3 formula (10) is identical up to notation with
formula (3) of Lecture 13 in [1]. It is therefore natural to
expect that p-vectors in the sense of Definition 2 actually
coincide with p-vectors introduced in Lecture 12 of [1]
(i.e. with classes of equivalent families of vectors) or in
other words that the equation

$$x_1 \wedge \dots \wedge x_p = y_1 \wedge \dots \wedge y_p$$

holds if and only if the families $x_1, \dots, x_p$ and $y_1, \dots, y_p$
are unimodularly equivalent (cf. Proposition 2 of Lecture 12
in [1]). It turns out that this is really the case. And since
for linearly dependent families of vectors this is immediate
from what has been said above, we may consider without
loss of generality only linearly independent families of
vectors.

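Theorem 2. For linearly independent families of vectors
$x_1, \dots, x_p$ and $y_1, \dots, y_p$ the equation

$$x_1 \wedge \dots \wedge x_p = y_1 \wedge \dots \wedge y_p$$

holds if and only if these families are unimodularly equivalent,
i.e. if

$$\begin{aligned} y_1 &= c_1^1 x_1 + \dots + c_1^p x_p, \\ &\;\;\vdots \\ y_p &= c_p^1 x_1 + \dots + c_p^p x_p, \end{aligned} \tag{11}$$

where

$$\begin{vmatrix} c_1^1 & \dots & c_1^p \\ \vdots & & \vdots \\ c_p^1 & \dots & c_p^p \end{vmatrix} = 1. \tag{12}$$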

In one direction Theorem 2 immediately follows from
formula (10). Indeed, if relations (11) hold then both families
are bases of the same p-dimensional subspace. Therefore
according to formula (10)

$$y_1 \wedge \dots \wedge y_p = \Delta\, (x_1 \wedge \dots \wedge x_p), \tag{13}$$

where $\Delta$ is the determinant (12). It remains to note that
under the hypothesis $\Delta = 1$. □

The converse is significantly subtler. We shall preface 
its proof with some preliminaries. 

For skew-symmetric functionals, as well as for arbitrary 
multilinear functionals, the concepts of rank and rank space 
are defined. But of course for such functionals the analogues 
of these concepts making use of external multiplication 
instead of tensor multiplication are much more natural. 

Let A be an arbitrary skew-symmetric functional of degree
p (of covectors, for definiteness).

Definition 3. A functional A is said to be externally expressible
in terms of vectors $x_1, \dots, x_r$ if it is a linear combination
of external products $x_{i_1} \wedge \dots \wedge x_{i_p}$, where
$1 \le i_1, \dots, i_p \le r$. Cf. Definition 4 of Lecture 6.

The number r is said to be the external rank of a skew- 
symmetric functional A if it satisfies the following conditions: 

(i) there exists a family of vectors consisting of r vectors 
in terms of which the functional A is externally expressible; 

(ii) in the case where the functional A is externally expres¬ 
sible in terms of some family of vectors the number of vectors 
in that family is not less than r. 

It remarkably turns out, however, that these definitions
are actually unnecessary since in fact a skew-symmetric functional
A is externally expressible in terms of vectors $x_1, \dots, x_r$
if and only if it is expressible in terms of them in tensor form
(so that the external rank of the functional A really coincides
with its rank). Indeed, if

$$A = A^{i_1 \dots i_p}\, x_{i_1} \otimes \dots \otimes x_{i_p},$$

then alternating this equation we get

$$A = A^{i_1 \dots i_p}\, x_{i_1} \wedge \dots \wedge x_{i_p}. \tag{14}$$

Conversely, if the last equation holds, then according to
formula (8)

$$A = A^{i_1 \dots i_p}\, x_{[i_1} \otimes \dots \otimes x_{i_p]}. \quad \square$$











Nevertheless the concept of external rank is not useless. 
It is clear indeed that the external rank of a nonzero skew- 
symmetric functional cannot be lower than its degree (for 
otherwise each term of the sum (14) would contain recurring 
multipliers). The same statement is true therefore also for 
the rank of the functional: 

Proposition 2. The rank r of a nonzero skew-symmetric functional
$A \in \Lambda^p(\mathcal{V})$ is not lower than its degree:

$$r \ge p. \quad \square$$


We shall employ this important property many a time 
in what follows. 

As a first application we shall prove the following state¬ 
ment characterizing multivectors in the class of all skew- 
symmetric functionals: 

Proposition 3. A skew-symmetric functional $A \in \Lambda^p(\mathcal{V})$
is a multivector if and only if its rank r is equal to its degree:

$$p = r.$$

Proof. If $A = x_1 \wedge \dots \wedge x_p$, then obviously $r \le p$.
Therefore the equation r = p must hold in view of Proposition 2.

Conversely, if r = p then the functional A has the form

$$A = A^{i_1 \dots i_p}\, x_{i_1} \wedge \dots \wedge x_{i_p},$$

where $x_1, \dots, x_p$ is a basis of its rank space. But then, as
shown by the reasoning already repeatedly used above, the
functional A is expressed by the formula

$$A = p!\, A^{1 \dots p}\, (x_1 \wedge \dots \wedge x_p)$$

and hence by the formula

$$A = y_1 \wedge \dots \wedge y_p,$$

where $y_1 = p!\, A^{1 \dots p} x_1$, $y_2 = x_2$, ..., $y_p = x_p$. □

Corollary. The rank space of a multivector $x_1 \wedge \dots \wedge x_p \ne 0$
is the linear span $[x_1, \dots, x_p]$ of the vectors
$x_1, \dots, x_p$.

We can now find the equality conditions for two multivectors.





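Proposition 4. For nonzero p-vectors

$$x_1 \wedge \dots \wedge x_p \quad \text{and} \quad y_1 \wedge \dots \wedge y_p$$

the following four statements are equivalent:

(a) the given p-vectors are proportional, i.e. there exists a
number $k \ne 0$ such that

$$y_1 \wedge \dots \wedge y_p = k\, (x_1 \wedge \dots \wedge x_p);$$

(b) the spans of the vectors $x_1, \dots, x_p$ and $y_1, \dots, y_p$
coincide:

$$[y_1, \dots, y_p] = [x_1, \dots, x_p];$$

(c) the families $x_1, \dots, x_p$ and $y_1, \dots, y_p$ are linearly equivalent,
i.e. there hold equations of the form

$$\begin{aligned} y_1 &= c_1^1 x_1 + \dots + c_1^p x_p, \\ &\;\;\vdots \\ y_p &= c_p^1 x_1 + \dots + c_p^p x_p, \end{aligned}$$

where

$$\begin{vmatrix} c_1^1 & \dots & c_1^p \\ \vdots & & \vdots \\ c_p^1 & \dots & c_p^p \end{vmatrix} \ne 0;$$

(d) the rank spaces of the p-vectors $x_1 \wedge \dots \wedge x_p$ and
$y_1 \wedge \dots \wedge y_p$ coincide.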

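Proof. Let

$$y_1 \wedge \dots \wedge y_p = k\, x_1 \wedge \dots \wedge x_p.$$

Since the functionals $y_1 \wedge \dots \wedge y_p$ and $k x_1 \wedge x_2 \wedge \dots \wedge x_p$
are equal, their rank spaces are the same. Therefore according
to the corollary of Proposition 3

$$[y_1, \dots, y_p] = [k x_1, x_2, \dots, x_p].$$

But clearly

$$[k x_1, x_2, \dots, x_p] = [x_1, x_2, \dots, x_p],$$

and hence by the same corollary the rank spaces of the functionals
$k x_1 \wedge x_2 \wedge \dots \wedge x_p$ and $x_1 \wedge \dots \wedge x_p$ are equal.
This proves that (a) $\Rightarrow$ (b) and (a) $\Rightarrow$ (d).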











The equivalence of conditions (b) and (c) is obvious (and 
was already noted by us in Lecture 1). 

If (c) holds, then the multivectors $x_1 \wedge \dots \wedge x_p$ and
$y_1 \wedge \dots \wedge y_p$ are connected by relation (13) and hence
(a) holds.

Finally, since the basis of the rank space of the functional
$x_1 \wedge \dots \wedge x_p$ consists of the vectors $x_1, \dots, x_p$, (d) implies (c).

Consequently, conditions (a), (b), (c) and (d) are all 
equivalent. □ 

Corollary. There is a natural bijective correspondence between
classes of proportional nonzero p-vectors and p-dimensional
subspaces of a space $\mathcal{V}$. In this correspondence to each
subspace $\mathcal{P}$ there corresponds an external product
$x_1 \wedge \dots \wedge x_p$ of vectors of its basis $x_1, \dots, x_p$ and to each p-vector
$x_1 \wedge \dots \wedge x_p$ there corresponds a subspace $[x_1, \dots, x_p]$.

Theorem 2 can now be proved without difficulty.

Proof of Theorem 2. We have already proved that if there
hold relations (11) together with equation (12) then
$y_1 \wedge \dots \wedge y_p = x_1 \wedge \dots \wedge x_p$. Conversely, if
$y_1 \wedge \dots \wedge y_p = x_1 \wedge \dots \wedge x_p$ then according to Proposition 4 there
hold relations (11) and hence equation (13) with $\Delta = 1$. □
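Relation (13) can be observed directly in coordinates. By formula (9), evaluated on covectors of the basis, the coordinates of $x_1 \wedge \dots \wedge x_p$ are, up to the common factor 1/p!, the minors of order p of the coordinate matrix of the family. The sketch below assumes coordinate lists as input and uses the hypothetical helpers `minors` and `combine`. It checks that these minors are multiplied by $\Delta$ when the family is transformed by (11).

```python
from fractions import Fraction
from itertools import combinations


def det(m):
    """Determinant by expansion along the first row (fine for small matrices)."""
    if not m:
        return Fraction(1)
    return sum((-1) ** j * m[0][j] * det([row[:j] + row[j + 1:] for row in m[1:]])
               for j in range(len(m)))


def minors(vectors):
    """All minors of order p of the p x n coordinate matrix, keyed by columns."""
    p, n = len(vectors), len(vectors[0])
    return {cols: det([[Fraction(v[c]) for c in cols] for v in vectors])
            for cols in combinations(range(n), p)}


def combine(c, vectors):
    """Family y_i = c_i^1 x_1 + ... + c_i^p x_p, as in relations (11)."""
    n = len(vectors[0])
    return [[sum(Fraction(c[i][j]) * vectors[j][k] for j in range(len(vectors)))
             for k in range(n)] for i in range(len(c))]


xs = [[1, 0, 2, 1], [0, 1, 1, 3]]
unimodular = [[2, 1], [1, 1]]   # determinant 1
stretch = [[2, 0], [0, 1]]      # determinant 2

print(minors(combine(unimodular, xs)) == minors(xs))
print(minors(combine(stretch, xs)) == {k: 2 * v for k, v in minors(xs).items()})
```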




Lecture 10 


Cartan's divisibility theorem • Plücker relations • The Plücker
coordinates of subspaces • Planes in an affine space •
Planes in a projective space and their coordinates


The criterion established by Proposition 3 of the preceding
lecture for a skew-symmetric functional to be a multivector
is ineffective in practice. To obtain a more convenient criterion
it is necessary first to prove the following statement
known as E. Cartan's theorem on divisibility (the divisibility
of a skew-symmetric functional by a multivector
is implied).

Proposition 1. Let $x_1 \wedge \dots \wedge x_r \ne 0$. For a skew-symmetric
functional A of degree $p \ge r$, there is a skew-symmetric
functional B of degree p − r such that

$$A = B \wedge x_1 \wedge \dots \wedge x_r$$

if and only if

$$A \wedge x_1 = 0, \ \dots, \ A \wedge x_r = 0. \tag{1}$$

Proof. If $A = B \wedge x_1 \wedge \dots \wedge x_r$, then for any
s = 1, ..., r the external product

$$A \wedge x_s = B \wedge x_1 \wedge \dots \wedge x_r \wedge x_s$$

contains two multipliers $x_s$ and is therefore zero.

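Conversely let relations (1) hold. Since the vectors $x_1, \dots, x_r$
are under the hypothesis linearly independent, they
can be supplemented to form some basis

$$e_1 = x_1, \ \dots, \ e_r = x_r, \ e_{r+1}, \ \dots, \ e_n$$

of the space $\mathcal{V}$. Let

$$A = p! \sum_{1 \le i_1 < \dots < i_p \le n} A^{i_1 \dots i_p}\, e_{i_1} \wedge \dots \wedge e_{i_p} \tag{2}$$

be an expansion of the functional A with respect to the corresponding
basis of the space $\Lambda^p(\mathcal{V})$ (see Theorem 1 of the
preceding lecture). Then for any s = 1, ..., r

$$A \wedge x_s = p! \sum_{1 \le i_1 < \dots < i_p \le n} A^{i_1 \dots i_p}\, e_{i_1} \wedge \dots \wedge e_{i_p} \wedge e_s.$$

If s is equal to one of the numbers $i_1, \dots, i_p$, then
$e_{i_1} \wedge \dots \wedge e_{i_p} \wedge e_s = 0$. In the sum for $A \wedge x_s$ we therefore
may restrict ourselves to the terms for which all indices
$i_1, \dots, i_p$ are other than s:

$$A \wedge x_s = p! \sum_{\substack{i_1 < \dots < i_p \\ i_1, \dots, i_p \ne s}} A^{i_1 \dots i_p}\, e_{i_1} \wedge \dots \wedge e_{i_p} \wedge e_s. \tag{3}$$

But when $i_1, \dots, i_p$ are not equal to s, all multivectors of
the form

$$e_{i_1} \wedge \dots \wedge e_{i_p} \wedge e_s, \qquad i_1 < \dots < i_p,$$

are, as we know, linearly independent. Therefore, if $A \wedge x_s = 0$,
then all the coefficients $A^{i_1 \dots i_p}$ in formula (3) are zero.

This proves that when conditions (1) hold only those coefficients
$A^{i_1 \dots i_p}$ in the expansion (2) may be nonzero for
which there is every index s = 1, ..., r among the indices
$i_1 < \dots < i_p$, i.e. such that $i_1 = 1, \dots, i_r = r$. Therefore

$$A = B \wedge e_1 \wedge \dots \wedge e_r = B \wedge x_1 \wedge \dots \wedge x_r,$$

where

$$B = (-1)^{r(p-r)}\, p! \sum_{r < i_{r+1} < \dots < i_p} A^{1 \dots r\, i_{r+1} \dots i_p}\, e_{i_{r+1}} \wedge \dots \wedge e_{i_p}. \quad \square$$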

Proposition 2. A skew-symmetric functional $A \in \Lambda^p(\mathcal{V})$
is a multivector if and only if

$$A \wedge x = 0$$

for any associated vector x.






Proof. If $A = x_1 \wedge \dots \wedge x_p$, then since the vectors
$x_1, \dots, x_p$ generate the rank space an associated vector x is linearly
expressible in terms of them and hence

$$A \wedge x = x_1 \wedge \dots \wedge x_p \wedge x = 0.$$

Conversely, if $A \wedge x = 0$ for any associated vector x,
then $A \wedge x = 0$ also for any vector x of the rank space.
In particular, if $x_1, \dots, x_r$ is a basis of the rank space, then

$$A \wedge x_1 = 0, \ \dots, \ A \wedge x_r = 0.$$

Consequently, according to Proposition 1 there exists a
functional B of degree p − r such that

$$A = B \wedge x_1 \wedge \dots \wedge x_r.$$

Of course, this is possible only for $p \ge r$. Since always
$p \le r$ (Proposition 2 of Lecture 9), this proves that p = r,
i.e. (Proposition 3 of Lecture 9) that the functional A is
a multivector. □


By virtue of skew-symmetry any vector associated with
a functional A is given (as an element of the space $(\mathcal{V}')'$)
by the formula

$$x\colon \xi \mapsto A(\xi, \beta^2, \dots, \beta^p),$$

where $\beta^2, \dots, \beta^p$ are some fixed covectors, and therefore
has the coordinates

$$x^i = A^{i j_2 \dots j_p}\, \beta^2_{j_2} \cdots \beta^p_{j_p}.$$

Hence the coordinates (coefficients) of the functional
$A \wedge x = \operatorname{Alt}(A \otimes x)$ are expressed by the formula

$$(A \wedge x)^{i_1 \dots i_p i_{p+1}} = A^{[i_1 \dots i_p} A^{i_{p+1}] j_2 \dots j_p}\, \beta^2_{j_2} \cdots \beta^p_{j_p}.$$

These expressions are equal to zero for all $\beta^2, \dots, \beta^p$ if
and only if

$$A^{[i_1 \dots i_p} A^{i_{p+1}] j_2 \dots j_p} = 0$$

for all $i_1, \dots, i_p, i_{p+1}, j_2, \dots, j_p$ (if for example
$A^{[i_1 \dots i_p} A^{i_{p+1}] j_2 \dots j_p} \ne 0$, then $(A \wedge x)^{i_1 \dots i_{p+1}} \ne 0$ for
$\beta^2_j = \delta^{j_2}_j, \dots, \beta^p_j = \delta^{j_p}_j$). Denoting for the sake of
symmetry the index $i_{p+1}$ by $j_1$ we see that we have proved
the following proposition.

Proposition 3. A skew-symmetric functional $A \in \Lambda^p(\mathcal{V})$
is a multivector if and only if its coefficients satisfy the relations

$$A^{[i_1 \dots i_p} A^{j_1] j_2 \dots j_p} = 0 \tag{4}$$

for all $i_1, \dots, i_p, j_1, j_2, \dots, j_p$. □

Relations (4) are known as Plücker relations.

Example. For p = 2 relations (4) have the form

$$A^{[i_1 i_2} A^{j_1] j_2} = 0,$$

i.e. the form

$$A^{i_1 i_2} A^{j_1 j_2} + A^{i_2 j_1} A^{i_1 j_2} + A^{j_1 i_1} A^{i_2 j_2} - A^{i_2 i_1} A^{j_1 j_2} - A^{j_1 i_2} A^{i_1 j_2} - A^{i_1 j_1} A^{i_2 j_2} = 0.$$

By virtue of skew-symmetry the first term is equal to the
fourth, the second to the fifth and the third to the sixth. Therefore,
reducing similar terms and cancelling by 2 we obtain
the relation

$$A^{i_1 i_2} A^{j_1 j_2} + A^{i_2 j_1} A^{i_1 j_2} + A^{j_1 i_1} A^{i_2 j_2} = 0. \tag{5}$$

If $i_1 = i_2$, then the first term is equal to zero and the other
two have different signs. In this case therefore relations (5)
hold automatically (by skew-symmetry). The situation is
similar when any two of the indices $i_1, i_2, j_1, j_2$ are equal.
Relations (5) are therefore essential if and only if all these
four indices are distinct.

Since for n = 3 this is impossible, it follows that all
Plücker relations are trivial for n = 3 (and p = 2), i.e. in
a three-dimensional space any skew-symmetric functional is
a bivector. This explains why in [1] we managed to convert
a set of bivectors into a vector space for n = 3.

For n = 4 there is only one nontrivial Plücker relation:

$$A^{12} A^{34} + A^{23} A^{14} + A^{13} A^{42} = 0.$$

In this case therefore bivectors do not constitute a vector
space (a sum of two bivectors is not in general a bivector).
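The single Plücker relation for n = 4 is easy to test on examples. In the sketch below (indices are 0-based in the code, and `bivector`, `add` and `plucker_defect` are hypothetical names) the relation holds for an external product of two vectors and fails for the sum $e_1 \wedge e_2 + e_3 \wedge e_4$, which is therefore not a bivector.

```python
from fractions import Fraction


def bivector(x, y):
    """Coefficients A^{ij} = x^{[i} y^{j]} = (x^i y^j - x^j y^i) / 2 of x ^ y."""
    n = len(x)
    return [[Fraction(x[i] * y[j] - x[j] * y[i], 2) for j in range(n)]
            for i in range(n)]


def add(a, b):
    """Sum of two skew-symmetric functionals of degree 2."""
    return [[u + v for u, v in zip(ra, rb)] for ra, rb in zip(a, b)]


def plucker_defect(a):
    """Left side A^{12}A^{34} + A^{23}A^{14} + A^{13}A^{42} (1-based indices)."""
    return a[0][1] * a[2][3] + a[1][2] * a[0][3] + a[0][2] * a[3][1]


e1, e2, e3, e4 = [1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]

decomposable = bivector([1, 2, 0, 1], [0, 1, 3, -1])
mixed = add(bivector(e1, e2), bivector(e3, e4))

print(plucker_defect(decomposable))  # 0: the relation holds
print(plucker_defect(mixed))         # 1/4: the relation fails
```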






It can be shown in a similar manner for any n that if
p = n − 1 all Plücker relations are trivial, i.e. that any
skew-symmetric functional of degree n − 1 is an (n − 1)-vector.

The essential coefficients $A^{i_1 \dots i_{n-1}}$, $1 \le i_1 < \dots < i_{n-1} \le n$,
of the functional obviously have the form $A^{1 \dots \hat{\imath} \dots n}$,
where the sign $\hat{\ }$ over the index means that that index
must be dropped. It is convenient to designate the indicated
coefficient by the symbol $(-1)^{i-1} B_i$.

Remark. Since the numbers $B_i$ can be interpreted as the coordinates
of some covector, we see that there is a bijection
between (n − 1)-vectors and covectors (i.e. 1-covectors). It
turns out that for any p there is a similar correspondence
between p-vectors and (n − p)-covectors. It depends in general
on the choice of basis $e_1, \dots, e_n$, but this dependence is
rather weak. That is, this correspondence turns out to be
the same for all unimodularly equivalent bases, i.e. those
that determine the same n-vector:

$$E_0 = e_1 \wedge \dots \wedge e_n.$$

In this correspondence, to every (n − p)-covector
$B = \beta^1 \wedge \dots \wedge \beta^{n-p}$ there corresponds a p-vector A defined as a
skew-symmetric functional by the formula

$$A(\xi^1, \dots, \xi^p) = E_0(\beta^1, \dots, \beta^{n-p}, \xi^1, \dots, \xi^p).$$

Irrespective of the n-vector $E_0$ this correspondence is determined
only up to proportionality, i.e. between the classes
of proportional p-vectors and (n − p)-covectors. Identifying
these classes with subspaces of the spaces $\mathcal{V}$ and $\mathcal{V}'$ (see the
corollary of Proposition 4 in the preceding lecture) leads to
a correspondence which associates with each subspace
$\mathcal{P} \subset \mathcal{V}$ its annulet $\mathcal{P}^0$.

In a Euclidean space, as will be shown in due course, it
is possible to identify covectors with vectors and hence
(n − p)-covectors with (n − p)-vectors. It follows that there
is a bijection between p-vectors and (n − p)-vectors in an
oriented Euclidean space. For n = 3 and p = 2 this correspondence
coincides with that introduced by Definition 4
of Lecture 15 in [1].

Unfortunately, we have no possibility to go deeper into
these most interesting questions.







It goes without saying that for p = n Plücker relations
are also trivial. This, however, follows also directly from
the fact that according to Theorem 1 of Lecture 9 (or, more
precisely, its "vector" analogue) the space $\Lambda^n(\mathcal{V})$ is one-dimensional
and is generated by any n-vector
$e_1 \wedge \dots \wedge e_n \ne 0$. Every skew-symmetric functional of degree
n therefore is an n-vector of the form $a\, e_1 \wedge \dots \wedge e_n$.
Thus every skew-symmetric functional of degree n − 1 or of
degree n is a multivector.

Besides, of course, the same is true for the degrees 0 and 1:
the 0-vectors are all the elements of the field K, and the
1-vectors are all the vectors of the space $\mathcal{V}$.

We now apply the results obtained to the geometry of
space.


Definition 1. A nonzero p-vector A is said to be a direction
p-vector of a p-dimensional subspace $\mathcal{P} \subset \mathcal{V}$ if $\mathcal{P}$ is its
rank subspace. Vectors $x \in \mathcal{P}$ are also said to be parallel
to the p-vector A (the notation is $x \parallel A$).

According to the corollary of Proposition 4 of the preceding
lecture every direction p-vector is an external product
$x_1 \wedge \dots \wedge x_p$ of vectors of some basis of the subspace $\mathcal{P}$
and is therefore, up to proportionality, uniquely determined
by the subspace (and of course uniquely determines it).
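Parallelism of a vector to a p-vector can be tested without solving for coefficients: for $A = x_1 \wedge \dots \wedge x_p \ne 0$ the product $A \wedge x = x_1 \wedge \dots \wedge x_p \wedge x$ vanishes exactly when x depends linearly on $x_1, \dots, x_p$ (the "vector" analogue of Proposition 7 of Lecture 8). The following sketch treats the case p = 2 only. It uses the fact that for a skew-symmetric A the coefficient $(A \wedge x)^{ijk}$ equals one third of the cyclic sum $A^{ij}x^k + A^{jk}x^i + A^{ki}x^j$. The names `bivector` and `is_parallel` are assumptions of this illustration.

```python
from fractions import Fraction
from itertools import combinations


def bivector(x, y):
    """Coefficients A^{ij} = (x^i y^j - x^j y^i) / 2 of the bivector x ^ y."""
    n = len(x)
    return [[Fraction(x[i] * y[j] - x[j] * y[i], 2) for j in range(n)]
            for i in range(n)]


def is_parallel(a, x):
    """True if A ^ x = 0, tested on the essential coefficients i < j < k."""
    return all(a[i][j] * x[k] + a[j][k] * x[i] + a[k][i] * x[j] == 0
               for i, j, k in combinations(range(len(x)), 3))


x1, x2 = [1, 0, 2, 1], [0, 1, 1, 3]
A = bivector(x1, x2)

print(is_parallel(A, [2, -1, 3, -1]))  # True:  2*x1 - x2 lies in the span
print(is_parallel(A, [0, 0, 0, 1]))    # False: not in the span of x1, x2
```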

Definition 2. The coordinates $A^{i_1 \dots i_p}$ of an arbitrary
direction p-vector of a p-dimensional subspace $\mathcal{P} \subset \mathcal{V}$ are
called the Plücker coordinates of the subspace. They are
determined (for a fixed basis of the space $\mathcal{V}$) up to proportionality,
i.e. are homogeneous coordinates.

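In terms of an arbitrary basis $x_1, \dots, x_p$ of the subspace
its Plücker coordinates are expressed by the formulas

$$A^{i_1 \dots i_p} = \rho\, x_{[1}^{i_1} \cdots x_{p]}^{i_p}$$

or equivalently by

$$A^{i_1 \dots i_p} = \rho \begin{vmatrix} x_1^{i_1} & \dots & x_1^{i_p} \\ \vdots & & \vdots \\ x_p^{i_1} & \dots & x_p^{i_p} \end{vmatrix},$$

where $\rho$ is an arbitrary factor of proportionality.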




For a set of numbers $A^{i_1 \dots i_p}$ to be a set of Plücker coordinates
of some p-dimensional subspace it is necessary and
sufficient that

(i) the numbers $A^{i_1 \dots i_p}$ should be skew-symmetrically
dependent on the indices, i.e. that for any permutation
$\sigma \in S_p$ there should be an equation

$$A^{i_{\sigma(1)} \dots i_{\sigma(p)}} = \varepsilon_\sigma A^{i_1 \dots i_p};$$

(ii) for any indices $i_1, \dots, i_p, j_1, \dots, j_p$ the Plücker
relation

$$A^{[i_1 \dots i_p} A^{j_1] j_2 \dots j_p} = 0$$

should hold.

This assertion is but an obvious restatement of the results 
we already know. 

Let $\mathcal{A}$ be an n-dimensional affine space (see Lecture 5
in [1]) and let $\mathcal{V}$ be an associated vector space. In complete
analogy with the definitions of a straight line and a plane
(see Lectures 5 and 6 in [1]) we introduce the following definition.

Definition 3. For any point $M_0 \in \mathcal{A}$ and an arbitrary
nonzero p-vector A of the space $\mathcal{V}$ the set of all points $M \in \mathcal{A}$
for which $\overrightarrow{M_0 M} \parallel A$ is called a p-dimensional plane passing
through the point $M_0$ and parallel to the p-vector A. The
p-vector A is also called a direction p-vector of the plane.

When p = 1 the plane is called a straight line (cf. Definition 7
of Lecture 5 in [1]), and when p = n − 1 it is called
a hyperplane.

When p = 0 planes are points in the space $\mathcal{A}$.

If $A = a_1 \wedge \dots \wedge a_p$ and a point O is chosen in the
space $\mathcal{A}$, then the condition $\overrightarrow{M_0 M} \parallel A$ is equivalent to the
equation

$$x = x_0 + t^1 a_1 + \dots + t^p a_p, \tag{6}$$

where $x_0 = \overrightarrow{O M_0}$ and $x = \overrightarrow{O M}$ are radius vectors of the
points $M_0$ and M and $t^1, \dots, t^p$ are arbitrary numbers.
Equation (6) is called the parametric vector equation of a plane.






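In an arbitrary affine coordinate system $O e_1 \dots e_n$ the
vector equation (6) is equivalent to n numerical equations

$$\begin{aligned} x^1 &= x_0^1 + t^1 a_1^1 + \dots + t^p a_p^1, \\ &\;\;\vdots \\ x^n &= x_0^n + t^1 a_1^n + \dots + t^p a_p^n, \end{aligned} \tag{7}$$

which are called the parametric (coordinate) equations of a
plane.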

In order to give a plane it is possible to use, instead of
a direction p-vector A, the corresponding subspace $\mathcal{P} \subset \mathcal{V}$
consisting of vectors parallel to the p-vector A (i.e. constituting
its rank space). The vectors of $\mathcal{P}$ are said to be
parallel to the plane considered.

Equation (6) means that a point M with radius vector x
lies in the plane if and only if $x - x_0 \in \mathcal{P}$, i.e. if the vector x
belongs to the coset $x_0 + \mathcal{P}$ of the space $\mathcal{V}$ modulo the
subspace $\mathcal{P}$. This justifies the following definition.

Definition 4. A subset of a vector space $\mathcal{V}$ is said to be
a linear manifold if there exists an (obviously unique) subspace
$\mathcal{P} \subset \mathcal{V}$ of which it is a coset. The dimension
of $\mathcal{P}$ is called the dimension of the linear manifold.

We can thus say that p-dimensional planes of an affine
space $\mathcal{A}$ are precisely those of its subsets which become
p-dimensional linear manifolds of the space $\mathcal{V}$ under the
bijective mapping

$$M \mapsto x = \overrightarrow{OM}.$$

In other words, the choice of a point $O \in \mathcal{A}$ allows the
planes of the affine space $\mathcal{A}$ to be identified with linear
manifolds of the vector space $\mathcal{V}$.

We know (see Lecture 4) that subspaces $\mathcal{P} \subset \mathcal{V}$ can be
given as the annulets of families of covectors $\xi^1, \dots, \xi^m \in \mathcal{V}'$,
i.e. by conditions of the form

$$\xi^1(x) = 0, \ \dots, \ \xi^m(x) = 0.$$

It follows that a vector $x \in \mathcal{V}$ belongs to the linear manifold
$x_0 + \mathcal{P}$ if and only if

$$\xi^1(x) = b^1, \ \dots, \ \xi^m(x) = b^m, \tag{8}$$

where $b^1 = \xi^1(x_0), \dots, b^m = \xi^m(x_0)$. This means that
equations (8) characterize radius vectors x of points in the
corresponding plane of the space $\mathcal{A}$.

In coordinates these equations have the form

$$\begin{aligned} \xi^1_1 x^1 + \dots + \xi^1_n x^n &= b^1, \\ &\;\;\vdots \\ \xi^m_1 x^1 + \dots + \xi^m_n x^n &= b^m, \end{aligned} \tag{9}$$

i.e. form a system of nonhomogeneous linear equations.

This proves the following theorem. 

Theorem 1. For any system of linear equations (9) the points
of a space $\mathcal{A}$ (the vectors of a space $\mathcal{V}$) whose coordinates
$x^1, \dots, x^n$ satisfy the system constitute, if they exist, some plane
(linear manifold), it being possible to obtain any plane (any
linear manifold) in this way. □

Equations (9) are called accordingly the equations of a
plane (of a linear manifold). The dimension of the plane is
n — r, where r is the rank of the matrix of the coefficients 
of system (9) (see Theorem 2 of Lecture 4). 

A change from the equations of a plane (9) to its parametric 
equations (7) means in algebraic terms the finding of a general 
solution of system (9) (which is effected by the method we 
know; see Lecture 2) and a change back from equations (7) 
to equations (9) means setting up equations whose general 
solution is of the form (7). 
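
The passage from equations (9) to parametric equations (7) can be tried out on a small example. The following Python sketch is only an illustration and does not claim to reproduce the method of Lecture 2 verbatim: it uses Gauss-Jordan elimination over exact fractions, and the function name `parametric_form` and the sample system are assumptions made for this example.

```python
from fractions import Fraction


def parametric_form(A, b):
    """Solve A x = b exactly.

    Returns (x0, directions), where x0 is one solution and `directions`
    is a basis of the solutions of the homogeneous system, so that every
    solution is x0 + t^1 * directions[0] + ... ; returns None if the
    system is incompatible.
    """
    m, n = len(A), len(A[0])
    # Augmented matrix with exact rational entries.
    M = [[Fraction(v) for v in row] + [Fraction(rhs)]
         for row, rhs in zip(A, b)]
    pivots = []  # pivots[r] is the pivot column of reduced row r
    row = 0
    for col in range(n):
        if row == m:
            break
        # Look for a usable pivot at or below the current row.
        pr = next((i for i in range(row, m) if M[i][col] != 0), None)
        if pr is None:
            continue
        M[row], M[pr] = M[pr], M[row]
        piv = M[row][col]
        M[row] = [v / piv for v in M[row]]
        for i in range(m):
            if i != row and M[i][col] != 0:
                f = M[i][col]
                M[i] = [vi - f * vr for vi, vr in zip(M[i], M[row])]
        pivots.append(col)
        row += 1
    # A row "0 = c" with c != 0 means that no plane is defined.
    if any(all(v == 0 for v in r[:n]) and r[n] != 0 for r in M):
        return None
    free = [c for c in range(n) if c not in pivots]
    x0 = [Fraction(0)] * n
    for r, c in enumerate(pivots):
        x0[c] = M[r][n]
    directions = []
    for f in free:
        d = [Fraction(0)] * n
        d[f] = Fraction(1)
        for r, c in enumerate(pivots):
            d[c] = -M[r][f]
        directions.append(d)
    return x0, directions


# Sample system in three unknowns: x + y + z = 1, x - y = 0.
x0, directions = parametric_form([[1, 1, 1], [1, -1, 0]], [1, 0])
print(x0, directions)
```

For the sample system the coefficient matrix has rank 2, so the sketch returns a single direction vector, in agreement with the dimension $n - r = 3 - 2 = 1$.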

Since the vectors $\mathbf{a}_1, \dots, \mathbf{a}_p$ are by hypothesis linearly
independent, the matrix

$$\begin{pmatrix} a_1^1 & \dots & a_p^1 \\ \vdots & & \vdots \\ a_1^n & \dots & a_p^n \end{pmatrix}$$

of their coordinates has a nonzero minor of order $p$. Regarding
the corresponding $p$ equations (7) as equations in $t^1, \dots, t^p$,
we can express $t^1, \dots, t^p$ in terms of $x^1, \dots, x^n$ using
Cramer’s formulas. On substituting then the obtained expressions
in the remaining $n - p$ equations (7) we get for $x^1, \dots, x^n$
precisely equations of the form (9) (with $m = n - p$).








All this means that the “geometric” theory of planes in 
an $n$-dimensional affine space is completely equivalent to
the “algebraic” theory of systems of nonhomogeneous linear 
equations in n unknowns. Both theories speak of the same 
things, but in different languages. It is necessary to learn 
to translate without difficulty from one language to the 
other. 

Example. The fact that a system of equations (9) has a
unique solution means that the corresponding plane has dimension
0 and is a point in the space $\mathcal{A}$. The subspace
$\mathcal{P} \subset \mathcal{V}$ corresponding to it consists of only the zero vector $\mathbf{0}$
in this case. Consequently the system of homogeneous equations

(10)
$$\begin{aligned}
\xi^1_1 x^1 + \dots + \xi^1_n x^n &= 0,\\
&\;\;\vdots\\
\xi^m_1 x^1 + \dots + \xi^m_n x^n &= 0
\end{aligned}$$

has only one trivial (zero) solution $(0, \dots, 0)$. Conversely,
if system (10) has only a trivial solution, then the subspace
$\mathcal{P}$ it defines consists of only the vector $\mathbf{0}$. Every coset $\mathbf{x} + \mathcal{P}$
therefore consists of only the vector $\mathbf{x}$ and hence equations (9)
have a unique solution. Thus we see that the (compatible)
system (9) of nonhomogeneous linear equations has a unique
solution if and only if system (10) of homogeneous linear
equations has only a trivial solution.

The geometrical fact, equivalent to this algebraic statement,
is simply that if $\mathcal{P} = 0$, then $\mathbf{x}_0 + \mathcal{P} = \mathbf{x}_0$ for any
$\mathbf{x}_0 \in \mathcal{V}$, and conversely.

Similarly a vector $\mathbf{x}$ belongs by definition to the coset
$\mathbf{x}_0 + \mathcal{P}$ if and only if it is of the form $\mathbf{x}_0 + \mathbf{a}$, where $\mathbf{a} \in \mathcal{P}$.
In “algebraic terms” this means, firstly, that the sum of
some fixed solution of system (9) and an arbitrary solution 
of system (10) is a solution of system (9) and, secondly, that 
any solution of system (9) can be obtained in this way. 

The various situations in relative positions of planes in
the space $\mathcal{A}$ can be algebraically characterized, say, by
conditions on the ranks of certain matrices and their submatrices.
We shall not go into this since it is rather
dull and at the same time extremely awkward.






The awkwardness of the theory of planes in an affine space 
is due (at least in part) to the existence of parallel planes. 
It is therefore natural that in a projective space this theory 
becomes somewhat easier (although remaining sufficiently 
complicated). 

A general definition of an $n$-dimensional projective space
over an arbitrary field $K$ was given in Lecture 26 of [1].
According to the definition one of the models of the space is
the set of all one-dimensional subspaces of the $(n + 1)$-dimensional
vector space $K^{n+1}$. Instead of $K^{n+1}$ we can of course
take any $(n + 1)$-dimensional vector space $\mathcal{V}$ and hence
any $(n + 1)$-dimensional affine space $\mathcal{A}$ with a point
$O$ marked in it. In the last variant the points of the resulting
model of a projective space are straight lines of the space $\mathcal{A}$
passing through the point $O$, i.e. we obtain the “bundle”
model we already know for the case $n = 2$ (see Lecture 25
in [1]).

For definiteness we shall consider the model $P^n(\mathcal{V})$
whose points are one-dimensional subspaces of an $(n + 1)$-dimensional
vector space $\mathcal{V}$.

Definition 5. A plane of dimension $r$ in a projective space
$P^n(\mathcal{V})$ is the set of all points of the space that are one-dimensional
subspaces of some $(r + 1)$-dimensional subspace $\mathcal{M} \subset \mathcal{V}$.

Thus every $r$-dimensional plane is by definition an $r$-dimensional
projective space $P^r(\mathcal{M})$.

Allowing a certain inaccuracy (but attaining in return brevity
of expression) one ordinarily says that the planes of
the space $P^n(\mathcal{V})$ are the subspaces of the space $\mathcal{V}$ (of a
dimension higher by unity). Thus, for example, already in
Lecture 25 of [1] straight lines of the bundle model were
identified with planes in an affine space $\mathcal{A}$ passing through a
point $O$ (i.e. with two-dimensional subspaces of an associated
vector space).

As was explained at length (for the case $n = 2$) in Lecture 25
of [1], the projective coordinates $x^0 : x^1 : \dots : x^n$ of points
in the space $P^n(\mathcal{V})$ are given by an arbitrary basis $\mathbf{e}_0, \mathbf{e}_1, \dots, \mathbf{e}_n$
of the space $\mathcal{V}$. For every point $M$ in $P^n(\mathcal{V})$ they
represent in this basis the coordinates of an arbitrary vector
$\mathbf{x} \in \mathcal{V}$ generating the point as a one-dimensional subspace of
the space $\mathcal{V}$. This means that the coordinates $x^0 : x^1 : \dots : x^n$
are the Plücker coordinates of that one-dimensional
subspace.

More generally, we can define (relative to a given projective
coordinate system) the projective coordinates of an
arbitrary plane $P^r(\mathcal{M}) \subset P^n(\mathcal{V})$ as the Plücker coordinates
of the subspace $\mathcal{M}$. These coordinates are thus of the form

$p^{i_0 i_1 \dots i_r}$,

where $i_0, i_1, \dots, i_r = 0, 1, \dots, n$, and are
subject to the following conditions:

(i) for any permutation $\sigma \in S_{r+1}$ we have

$p^{i_{\sigma(0)} \dots i_{\sigma(r)}} = \varepsilon_\sigma\, p^{i_0 \dots i_r}$

(it is assumed that $\sigma$ acts on the numbers $0, 1, \dots, r$); it
follows from this condition that there are only $\binom{n+1}{r+1}$
essential coordinates among the coordinates $p^{i_0 \dots i_r}$; they
are, for example, the coordinates

$p^{i_0 i_1 \dots i_r}$ with $i_0 < i_1 < \dots < i_r$;

(ii) for any indices $i_0, i_1, \dots, i_r$, $j_0, j_1, \dots, j_r$ the
Plücker relation

(11) $p^{[i_0 i_1 \dots i_r} p^{j_0] j_1 \dots j_r} = 0$

holds.

It can be shown that there are exactly

(12) $\binom{n+1}{r+1} - (r + 1)(n - r) - 1$

independent relations (11).

A straightforward proof of this fact calls for rather sophis¬ 
ticated combinatorics. Next semester we shall develop a 
general technique for computing such constants with the aid 
of which the number (12) can be trivially obtained. 
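
The two counts just mentioned are easy to tabulate. The short Python sketch below simply evaluates the binomial coefficient of condition (i) and the number (12); the function names are hypothetical and chosen for this illustration only.

```python
from math import comb


def essential_coordinates(n, r):
    """Number of Pluecker coordinates with i_0 < i_1 < ... < i_r."""
    return comb(n + 1, r + 1)


def independent_relations(n, r):
    """The number (12) of independent Pluecker relations."""
    return comb(n + 1, r + 1) - (r + 1) * (n - r) - 1


for n, r in [(2, 1), (3, 1), (3, 2), (4, 1), (5, 2)]:
    print(n, r, essential_coordinates(n, r), independent_relations(n, r))
```

For points ($r = 0$) and hyperplanes ($r = n - 1$) the number (12) vanishes, while for straight lines in three-dimensional space ($n = 3$, $r = 1$) it equals 1.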

It is possible to develop a geometry whose basic elements
(“points”) are $r$-dimensional planes in an $n$-dimensional projective
space (or equivalently $(r + 1)$-dimensional subspaces
of an $(n + 1)$-dimensional vector space). Analytically this
can be done using the coordinates $p^{i_0 \dots i_r}$, i.e. in other
words by means of identifying $r$-dimensional planes of an
$n$-dimensional projective space with the points of an $N$-dimensional
projective space, where $N = \binom{n+1}{r+1} - 1$, having
the coordinates $p^{i_0 \dots i_r}$, $i_0 < i_1 < \dots < i_r$.

Suppose, for example, $r = n - 1$ (the case of hyperplanes).
Then, as noted above, essential coordinates are the
$n + 1$ coordinates $q_0, \dots, q_n$, where $q_i$ is, up to sign, the
coordinate $p^{i_0 \dots i_{n-1}}$ whose indices are the numbers
$0, \dots, n$ with $i$ omitted.

On the other hand, hyperplanes are obviously given by a
single linear equation for the coordinates $x^0 : x^1 : \dots : x^n$, the
numbers $q_0, \dots, q_n$, as can be verified without difficulty
(do it!), being exactly the coefficients of the equation.
In the first semester (see Lectures 24 and 26 of [1]) we took
as the coordinates for straight lines in the plane (the case
$n = 2$, $r = 1$) and planes in space (the case $n = 3$, $r = 2$)
the coefficients of their equations. We thus see that the Plücker
coordinates $p^{i_0 \dots i_r}$ are a direct generalization of
these coordinates.


The fact that for $r = n - 1$ the coordinates $p^{i_0 \dots i_r}$ obey
no nontrivial Plücker relations (11) means that in representing
hyperplanes by points of a $\left(\binom{n+1}{n} - 1\right)$-dimensional
projective space we obtain the whole of the space. Since
$\binom{n+1}{n} - 1 = n$, it follows that the geometry of hyperplanes
is equivalent to that of points and in any case is not more
complicated than the latter.

The situation is different already for straight lines in a
three-dimensional space (the case $n = 3$ and $r = 1$). Here
we have $\binom{4}{2} = 6$ essential coordinates $p^{ij}$, $0 \le i < j \le 3$,
which thus determine (in view of homogeneity) a point in a
five-dimensional space $KP^5$. Besides, these six coordinates
must satisfy one more relation

$p^{01} p^{23} + p^{12} p^{03} + p^{02} p^{31} = 0$

(see above; the indices are decreased by 1 because now they
take values from 0 to 3), which defines in $KP^5$ a “second-degree
hypersurface”. Thus we see that the geometry of
straight lines in space is equivalent to that of points of some
“curved” hypersurface in a five-dimensional space. It is no
wonder therefore that the geometry of straight lines in space
is much more complicated than, say, the geometry of planes.
It is for this reason that we actually entirely ignored this
geometry in the first semester.
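
The relation for straight lines in three-dimensional space can be checked numerically. The sketch below is an illustration under stated assumptions: a line is given by two vectors `a` and `b` spanning the corresponding two-dimensional subspace, its coordinates are taken as $p^{ij} = a^i b^j - a^j b^i$, and the helper names are made up for this example.

```python
from itertools import combinations


def pluecker_coordinates(a, b):
    """Coordinates p^{ij} = a^i b^j - a^j b^i for i < j."""
    return {(i, j): a[i] * b[j] - a[j] * b[i]
            for i, j in combinations(range(len(a)), 2)}


def pluecker_relation(p):
    """Left-hand side p01*p23 + p12*p03 + p02*p31 (indices 0..3)."""
    p31 = -p[(1, 3)]  # skew symmetry: p^{31} = -p^{13}
    return (p[(0, 1)] * p[(2, 3)]
            + p[(1, 2)] * p[(0, 3)]
            + p[(0, 2)] * p31)


# A line spanned by two sample vectors of a four-dimensional space.
p = pluecker_coordinates((1, 2, 3, 4), (0, 1, 1, 2))
print(p, pluecker_relation(p))

# Six numbers that do not come from any line: the relation fails.
q = {(0, 1): 1, (0, 2): 0, (0, 3): 0, (1, 2): 0, (1, 3): 0, (2, 3): 1}
print(pluecker_relation(q))
```

The first value of the relation is zero, as it must be for an actual line, whereas the second set of six numbers gives a nonzero value and therefore corresponds to a point of the five-dimensional space lying off the hypersurface.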

Still more complicated of course is the situation for any $r$
and $n$. The variety of $r$-dimensional planes of an $n$-dimensional
space is represented by points of a $\left(\binom{n+1}{r+1} - 1\right)$-dimensional
space lying in the intersection $N_{n,r}$ of second-degree
hypersurfaces. This intersection is called the Grassmann manifold
and has been intensively studied for many years. We
have not yet got a comprehensive knowledge of its geometry,
however.




Lecture 11


Symmetric and skew-symmetric bilinear functionals • A matrix
of symmetric bilinear functionals • The rank of a bilinear
functional • Quadratic functionals and quadratic forms • Lagrange
theorem


By analogy with skew-symmetric functionals symmetric
multilinear functionals are defined to be functionals $B$ in
$T_p(\mathcal{V})$ (or in $T^p(\mathcal{V})$) such that

$\sigma B = B$

for any permutation $\sigma \in S_p$. The theory of such functionals,
however, turns out to be very complicated and up to now
very little is known about them in the general case. The
only exception is the case $p = 2$, i.e. the case of bilinear
functionals. We shall now deal with these functionals. For
definiteness we shall consider functionals of vectors (i.e.
in $T_2(\mathcal{V})$).

Since the group $S_2$ consists of two elements only, the
identity permutation and the transposition (12), a bilinear
functional $B$ is

symmetric if $B(\mathbf{x}, \mathbf{y}) = B(\mathbf{y}, \mathbf{x})$,

skew-symmetric if $B(\mathbf{x}, \mathbf{y}) = -B(\mathbf{y}, \mathbf{x})$

for any vectors $\mathbf{x}, \mathbf{y} \in \mathcal{V}$.

Just as skew-symmetric functionals constitute a subspace
$\Lambda_2(\mathcal{V})$ of the space $T_2(\mathcal{V})$, so the set $S_2(\mathcal{V})$ of all symmetric
bilinear functionals is a subspace of the space $T_2(\mathcal{V})$.

If the characteristic of the ground field $K$ is two, then
$\Lambda_2(\mathcal{V}) = S_2(\mathcal{V})$. But if the characteristic of the field $K$ is
other than two, then the subspaces $\Lambda_2(\mathcal{V})$ and $S_2(\mathcal{V})$ are
obviously disjoint:

$\Lambda_2(\mathcal{V}) \cap S_2(\mathcal{V}) = 0$.

In what follows we always assume this to hold.

Proposition 1. The space $T_2(\mathcal{V})$ is a direct sum of the spaces
$\Lambda_2(\mathcal{V})$ and $S_2(\mathcal{V})$, i.e. any bilinear functional $B$ can be
uniquely represented as a sum of a symmetric functional $B_{\mathrm{symm}}$
and a skew-symmetric functional $B_{\mathrm{skew}}$:

(1) $B = B_{\mathrm{symm}} + B_{\mathrm{skew}}$.

Proof. The uniqueness of expansion (1) is ensured by the
disjointness of the spaces $\Lambda_2(\mathcal{V})$ and $S_2(\mathcal{V})$, and in order to
find at least one such expansion it is sufficient to put

$B_{\mathrm{symm}} = \dfrac{B + \sigma B}{2}, \qquad B_{\mathrm{skew}} = \dfrac{B - \sigma B}{2}$,

where $\sigma$ is the transposition (12). □

Note that $B_{\mathrm{skew}}$ is none other than $\operatorname{Alt} B$.
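
In terms of matrices, the functional $\sigma B$ has the transposed matrix, so the expansion (1) can be sketched in a few lines of Python. This is an illustration only: the function name `decompose` and the sample matrix are assumptions, and exact fractions are used so that the division by 2 (legitimate because the characteristic is not two) is exact.

```python
from fractions import Fraction


def decompose(B):
    """Split a square matrix into symmetric and skew-symmetric parts."""
    n = len(B)
    half = Fraction(1, 2)
    symm = [[half * (B[i][j] + B[j][i]) for j in range(n)] for i in range(n)]
    skew = [[half * (B[i][j] - B[j][i]) for j in range(n)] for i in range(n)]
    return symm, skew


B = [[1, 2], [4, 3]]
symm, skew = decompose(B)
print(symm)
print(skew)
```

Adding the two returned matrices entry by entry gives back the original matrix, which is the matrix form of (1).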

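As we know (see Lecture 5), in a given basis $\mathbf{e}_1, \dots, \mathbf{e}_n$ of
a space $\mathcal{V}$ a functional $B$ is uniquely defined by its matrix

(2)
$$\begin{pmatrix} b_{11} & \dots & b_{1n} \\ \vdots & & \vdots \\ b_{n1} & \dots & b_{nn} \end{pmatrix}$$

the elements $b_{ij}$ of which are given by the formula

$b_{ij} = B(\mathbf{e}_i, \mathbf{e}_j), \quad i, j = 1, \dots, n$.

The value $B(\mathbf{x}, \mathbf{y})$ of the functional $B$ on arbitrary vectors
$\mathbf{x}, \mathbf{y} \in \mathcal{V}$ is a bilinear form of their coordinates:

$B(\mathbf{x}, \mathbf{y}) = b_{ij} x^i y^j$

with coefficients $b_{ij}$. It readily follows that a functional $B$
is symmetric if and only if so is its matrix, i.e.

$b_{ij} = b_{ji}$

for any $i, j = 1, \dots, n$.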



Indeed, if a functional $B$ is symmetric, then in particular
$B(\mathbf{e}_i, \mathbf{e}_j) = B(\mathbf{e}_j, \mathbf{e}_i)$, i.e. $b_{ij} = b_{ji}$. Conversely, if
$b_{ij} = b_{ji}$, then for any vectors $\mathbf{x}, \mathbf{y}$ we have

$B(\mathbf{y}, \mathbf{x}) = b_{ij} y^i x^j = b_{ji} y^i x^j = b_{ij} x^i y^j = B(\mathbf{x}, \mathbf{y})$

and therefore the functional $B$ is symmetric.

For bilinear functionals, just as for any multilinear func¬ 
tionals, the concept of rank is defined, i.e. (see Lecture 7) 
that of the smallest number of covectors in terms of which a 
functional is expressible in tensor form. On the other hand, 
especially for the bilinear functional it is possible to speak 
of the rank of its matrix (2). One would like to think that 
the two concepts of rank coincide. However, in general 
this is not true. 

Example. Let $n = 2$ and $B = \mathbf{e}^1 \otimes \mathbf{e}^2$. The matrix of
this functional has the form

$$\begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix}$$



and hence its rank equals unity. On the other hand, in ten¬ 
sor form the functional B is expressible in terms of two 
covectors and cannot be expressed in terms of one covector 
(if only because any bilinear functional of rank 1 is sym¬ 
metric). 

Definition 1. The rank of matrix (2) is called the matrix
rank of a bilinear functional $B$.

As we know (see Lecture 5), when changing to another
basis the matrix of the functional $B$ is multiplied on the
left and right by matrices $C^{\top}$ and $C$, where $C$ is the transition
matrix. Since the rank of a matrix remains unaltered when it is
multiplied by a nonsingular matrix (Proposition 2 of
Lecture 2), this shows that Definition 1 is correct.

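Proposition 2. The matrix rank $r_{\mathrm{mat}}$ of a bilinear functional
$B$ does not exceed its rank $r$:

$r_{\mathrm{mat}} \le r$.

Proof. Let $\mathbf{e}^1, \dots, \mathbf{e}^r$ be a basis of the rank space of the
functional $B$. Then

$B = \sum_{i,j=1}^{r} b_{ij}\, \mathbf{e}^i \otimes \mathbf{e}^j$,

where $b_{ij}$, $i, j = 1, \dots, r$, are some numbers. Now if we
extend the basis $\mathbf{e}^1, \dots, \mathbf{e}^r$ to a basis

$\mathbf{e}^1, \dots, \mathbf{e}^r, \mathbf{e}^{r+1}, \dots, \mathbf{e}^n$

of the space $\mathcal{V}'$, then in the corresponding basis $\mathbf{e}^i \otimes \mathbf{e}^j$,
$i, j = 1, \dots, n$, of the space $T_2(\mathcal{V})$ (see Proposition 1 of
Lecture 5) the formula

$B = \sum_{i,j=1}^{r} b_{ij}\, \mathbf{e}^i \otimes \mathbf{e}^j$

will hold. This means that in the basis $\mathbf{e}_1, \dots, \mathbf{e}_n$ the matrix
of the functional has the form

$$\begin{pmatrix}
b_{11} & \dots & b_{1r} & 0 & \dots & 0 \\
\vdots & & \vdots & \vdots & & \vdots \\
b_{r1} & \dots & b_{rr} & 0 & \dots & 0 \\
0 & \dots & 0 & 0 & \dots & 0 \\
\vdots & & \vdots & \vdots & & \vdots \\
0 & \dots & 0 & 0 & \dots & 0
\end{pmatrix}$$

Therefore its rank does not exceed $r$. □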

For symmetric bilinear functionals the situation is much 
more satisfactory. 

Proposition 3. The matrix rank of a symmetric bilinear
functional $B$ coincides with its rank:

$r_{\mathrm{mat}} = r$.

Proof. By virtue of symmetry any associated covector
has the form

(3) $\mathbf{x} \mapsto B(\mathbf{x}, \mathbf{a})$,

where $\mathbf{a}$ is an arbitrary vector. Since all such covectors are
obviously linearly expressible in terms of covectors

(4) $\mathbf{x} \mapsto B(\mathbf{x}, \mathbf{e}_i), \quad i = 1, \dots, n$,

this proves that the rank space of the functional $B$ is
generated by covectors (4). Therefore the rank $r$, i.e. the dimension
of the rank space, equals the rank of the family of covectors (4).
But it is clear that the coordinates (in the basis $\mathbf{e}^1, \dots, \mathbf{e}^n$) of
covectors (4) are the columns of matrix (2). Hence the rank $r$
equals the rank of this matrix. □

Note that this proof obviously remains valid for skew-symmetric
bilinear functionals as well, so that $r_{\mathrm{mat}} = r$ for
them too.

Besides, in both cases covectors of the form (3) obviously
form a subspace. Hence for (skew-)symmetric bilinear functionals
the rank subspace consists of associated covectors
(and is not merely generated by them).

Definition 2. A functional $Q\colon \mathbf{x} \mapsto Q(\mathbf{x}) \in K$ is said to be
quadratic if there exists a bilinear functional $B$ such that

(5) $Q(\mathbf{x}) = B(\mathbf{x}, \mathbf{x})$

for any vector $\mathbf{x} \in \mathcal{V}$.

Expanding $B$ by formula (1) and taking into account the
fact that $B_{\mathrm{skew}}(\mathbf{x}, \mathbf{x}) = 0$ we find that the functional $B$ in
formula (5) may be assumed to be symmetric without loss
of generality.

It is easy to see that then the functional $B$ is uniquely
determined by the functional $Q$, i.e. in other words the
correspondence

$B \mapsto Q$

is a bijection between the vector space $S_2(\mathcal{V})$ of symmetric
bilinear functionals and the set of all quadratic functionals
in $\mathcal{V}$ (we assume as before that the characteristic of the
ground field $K$ is other than two).

It is indeed clear that if $Q(\mathbf{x}) = B(\mathbf{x}, \mathbf{x})$ and the functional
$B$ is symmetric, then

$\dfrac{Q(\mathbf{x} + \mathbf{y}) - Q(\mathbf{x}) - Q(\mathbf{y})}{2} = B(\mathbf{x}, \mathbf{y})$

for any vectors $\mathbf{x}, \mathbf{y} \in \mathcal{V}$. □
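
The recovery of a symmetric bilinear functional from its quadratic functional can be checked on numbers. The sketch below assumes the polarization identity in the form $B(\mathbf{x}, \mathbf{y}) = \tfrac{1}{2}\bigl(Q(\mathbf{x} + \mathbf{y}) - Q(\mathbf{x}) - Q(\mathbf{y})\bigr)$; the function names and the sample matrix are hypothetical and serve this illustration only.

```python
from fractions import Fraction


def bilinear(Q, x, y):
    """Value q_ij x^i y^j of the bilinear functional with matrix Q."""
    n = len(x)
    return sum(Q[i][j] * x[i] * y[j] for i in range(n) for j in range(n))


def quadratic(Q, x):
    """Value Q(x) = B(x, x) of the corresponding quadratic functional."""
    return bilinear(Q, x, x)


def polar(Q, x, y):
    """Recover B(x, y) from values of the quadratic functional only."""
    s = [xi + yi for xi, yi in zip(x, y)]
    return Fraction(quadratic(Q, s) - quadratic(Q, x) - quadratic(Q, y), 2)


Q = [[1, 2], [2, -3]]          # a symmetric sample matrix with integer entries
x, y = (1, 2), (3, -1)
print(bilinear(Q, x, y), polar(Q, x, y))
```

For a symmetric matrix the two printed values coincide; for a nonsymmetric matrix `polar` would return the value of the symmetric part only, which reflects the remark that $B$ may be assumed symmetric.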

It makes no difference whatsoever in principle therefore 
whether one considers symmetric bilinear or quadratic func¬ 
tionals, for any statement about quadratic functionals can 
be reformulated as a statement about symmetric bilinear 
functionals and vice versa. We choose quadratic functionals 
as basic leaving it to the reader to reformulate the statements 
about them in terms of symmetric bilinear functionals. 



To simplify the notation we shall designate the symmetric
bilinear functional corresponding to the quadratic functional
$Q$ by the same symbol $Q$. Its rank will be called the
rank of the quadratic functional $Q$.

In every basis $\mathbf{e}_1, \dots, \mathbf{e}_n$ of a space $\mathcal{V}$ the quadratic
functional $Q$ is given by its matrix

(6)
$$\begin{pmatrix} q_{11} & \dots & q_{1n} \\ \vdots & & \vdots \\ q_{n1} & \dots & q_{nn} \end{pmatrix}$$

whose elements are defined by the formula

$q_{ij} = Q(\mathbf{e}_i, \mathbf{e}_j), \quad i, j = 1, \dots, n$.

Matrix (6) is a square symmetric matrix of order $n$ and
the correspondence

“a functional” $\mapsto$ “its matrix”

is a bijective correspondence between the set of all quadratic
functionals in $\mathcal{V}$ and the set of all symmetric matrices of
order $n$ with elements of the field $K$. This correspondence
depends on the choice of basis, in another basis matrix (6)
being multiplied on the left and right by matrices $C^{\top}$ and $C$,
where $C$ is the transition matrix. Assuming the basis to be
fixed we designate matrix (6) by the same symbol $Q$ that is
used for the quadratic functional, in order not to introduce
new letters.

The rank of matrix (6) equals that of the functional $Q$
(Proposition 3).

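For any vector $\mathbf{x} = x^i \mathbf{e}_i$ of a space $\mathcal{V}$ we have

$$\begin{aligned}
Q(\mathbf{x}) = q_{ij} x^i x^j
= {} & q_{11} (x^1)^2 + 2 q_{12} x^1 x^2 + \dots + 2 q_{1n} x^1 x^n + {}\\
& + q_{22} (x^2)^2 + \dots + 2 q_{2n} x^2 x^n + {}\\
& + \dots + q_{nn} (x^n)^2
\end{aligned}$$

or in matrix form

$Q(\mathbf{x}) = x^{\top} Q x$,

where, as always,

$$x = \begin{pmatrix} x^1 \\ \vdots \\ x^n \end{pmatrix}$$

and $Q$ is matrix (6).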

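Definition 3. A polynomial $Q(x^1, \dots, x^n)$ of variables
$x^1, \dots, x^n$ is said to be a quadratic form if it is homogeneous
(if all its members are of the same degree) and has degree 2.
Cf. Lecture 14 of [1].

Any quadratic form has the form

$$\begin{aligned}
Q(x^1, \dots, x^n) = q_{ij} x^i x^j
= {} & q_{11} (x^1)^2 + \dots + 2 q_{1n} x^1 x^n + {}\\
& + q_{22} (x^2)^2 + \dots + 2 q_{2n} x^2 x^n + {}\\
& + \dots + q_{nn} (x^n)^2
\end{aligned}$$

and is therefore uniquely determined by the matrix

$$\begin{pmatrix} q_{11} & q_{12} & \dots & q_{1n} \\ \vdots & \vdots & & \vdots \\ q_{n1} & q_{n2} & \dots & q_{nn} \end{pmatrix}$$

which is called the matrix of the quadratic form.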

Thus we see that the value $Q(\mathbf{x})$ of an arbitrary quadratic
functional $Q$ on a vector $\mathbf{x} \in \mathcal{V}$ is expressed by the quadratic
form

$Q(\mathbf{x}) = Q(x^1, \dots, x^n)$

in the coordinates $x^1, \dots, x^n$ of that vector.

This establishes a bijection (dependent on the choice of 
basis) between quadratic functionals and quadratic forms. 

Definition 4. Two quadratic forms are said to be equiva¬ 
lent if they correspond to the same quadratic functional in 
different bases. 





Two quadratic forms may also be said to be equivalent if
for their matrices $Q_1$ and $Q_2$ there holds an equation of the
form

$Q_2 = C^{\top} Q_1 C$,

where $C$ is some nonsingular matrix.

But if we introduce homogeneous linear transformations

(7)
$$\begin{aligned}
y^1 &= c^1_1 x^1 + \dots + c^1_n x^n,\\
&\;\;\vdots\\
y^n &= c^n_1 x^1 + \dots + c^n_n x^n
\end{aligned}$$

with nonsingular matrices $C = (c^i_j)$,
it may be said that the form $Q_1(x^1, \dots, x^n)$ is equivalent
to the form $Q_2(x^1, \dots, x^n)$ if there exists a transformation (7)
such that on designating the variables of the
form $Q_2$ by the symbols $y^1, \dots, y^n$ and substituting (7) we
obtain the form $Q_1$.

Scalar multiplication introduced in Lecture 13 of [1] is a
special case of a symmetric bilinear functional, characterized
by the positivity axiom 15°. If this axiom is discarded, then
instead of Euclidean spaces we obtain simply spaces $\mathcal{V}$ in
which some symmetric bilinear (or, equivalently, quadratic)
functional $Q$ is given. Such spaces are commonly called
pseudo-Euclidean spaces (sometimes only when $K = \mathbb{R}$).

Following the analogy with the Euclidean case vectors
$\mathbf{x}, \mathbf{y} \in \mathcal{V}$ are said to be orthogonal with respect to $Q$ or, briefly,
$Q$-orthogonal if

$Q(\mathbf{x}, \mathbf{y}) = 0$.


The question arises: is there an analogue of the Gram-Schmidt
orthogonalization process (see Lecture 14 in [1])
for pseudo-Euclidean spaces? Since the concept of an orthonormal
family cannot be extended to pseudo-Euclidean
spaces (how is it possible to normalize a vector $\mathbf{x} \ne \mathbf{0}$ for
which $Q(\mathbf{x}) = 0$?) it is natural to pose the question about
transforming an arbitrary basis only into a $Q$-orthogonal basis
$\mathbf{e}_1, \dots, \mathbf{e}_n$, i.e. such that

$Q(\mathbf{e}_i, \mathbf{e}_j) = 0$ for $i \ne j$.

The answer to the question turns out to be yes.

Theorem 1 (Lagrange theorem). For any quadratic functional
$Q$ in $\mathcal{V}$ there exists a basis $\mathbf{e}_1, \dots, \mathbf{e}_n$ of the space $\mathcal{V}$ such that

(8) $Q(\mathbf{e}_i, \mathbf{e}_j) = 0$ for $i \ne j$.

Proof. We shall not only prove the theorem, but also indicate
a practical algorithm allowing an arbitrary basis of a
space $\mathcal{V}$ to be transformed into a basis possessing property (8).

The algorithm is called the Lagrange algorithm. It con¬ 
sists in applying sequentially three elementary transfor¬ 
mations one of which we shall call basic and the other two 
auxiliary. 

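Basic Lagrange transformation. It is applied to a basis
$\mathbf{e}_1, \dots, \mathbf{e}_n$ if

$q_{11} = Q(\mathbf{e}_1) \ne 0$.

It converts the basis into a basis

(9)
$$\begin{aligned}
\mathbf{e}'_1 &= \mathbf{e}_1,\\
\mathbf{e}'_2 &= -\frac{q_{12}}{q_{11}}\,\mathbf{e}_1 + \mathbf{e}_2,\\
&\;\;\vdots\\
\mathbf{e}'_n &= -\frac{q_{1n}}{q_{11}}\,\mathbf{e}_1 + \mathbf{e}_n.
\end{aligned}$$

The resulting basis has the property that its first vector is
$Q$-orthogonal to all the rest:

$Q(\mathbf{e}'_1, \mathbf{e}'_i) = 0$ for $i > 1$.

Indeed

$Q(\mathbf{e}'_1, \mathbf{e}'_i) = Q\left(\mathbf{e}_1, -\dfrac{q_{1i}}{q_{11}}\,\mathbf{e}_1 + \mathbf{e}_i\right) = -\dfrac{q_{1i}}{q_{11}}\, q_{11} + q_{1i} = 0$.

If now $q'_{22} = Q(\mathbf{e}'_2, \mathbf{e}'_2) \ne 0$, then applying to the vectors
$\mathbf{e}'_2, \dots, \mathbf{e}'_n$ (i.e., more precisely, to the restriction of the
functional $Q$ to the subspace $[\mathbf{e}'_2, \dots, \mathbf{e}'_n]$) the same
transformation we obtain a basis $\mathbf{e}''_1, \mathbf{e}''_2, \dots, \mathbf{e}''_n$ the first two
vectors of which, $\mathbf{e}''_1$ and $\mathbf{e}''_2$, are $Q$-orthogonal to each other and
to the other vectors, and so on.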

Suppose this construction can be carried through to the end, i.e. every
time (until we exhaust the basis or obtain a zero functional)
the basic transformation is applicable. Then Theorem 1
is proved. This case is said to be regular.

But if at some stage the basic transformation (9) turns 
out to be inapplicable, then one should make auxiliary 
transformations which result in a basis to which transforma¬ 
tion (9) is now applicable. 

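First auxiliary transformation. It is applied when $q_{11} = 0$
but there is an index $i_0$ such that $q_{i_0 i_0} \ne 0$. It consists in
permuting the $i_0$-th vector of the basis to the first place:

$\mathbf{e}'_1 = \mathbf{e}_{i_0}, \quad \mathbf{e}'_{i_0} = \mathbf{e}_1, \quad \mathbf{e}'_i = \mathbf{e}_i$ if $i \ne 1, i_0$.

It is obvious that in the new basis $q'_{11} \ne 0$.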

Second auxiliary transformation. It is applied when $q_{ii} = 0$
for all $i = 1, \dots, n$ but the functional $Q$ is not zero
and therefore there exist indices $i_0$ and $j_0$ such that
$q_{i_0 j_0} \ne 0$. If, for example, $q_{12} \ne 0$ (this assumption does not lead
to any loss of generality, of course), then the transformation
considered is given by the formulas

$\mathbf{e}'_1 = \mathbf{e}_1 + \mathbf{e}_2$,

$\mathbf{e}'_i = \mathbf{e}_i$ if $i \ge 2$.

Then

$q'_{11} = Q(\mathbf{e}'_1) = Q(\mathbf{e}_1 + \mathbf{e}_2, \mathbf{e}_1 + \mathbf{e}_2) = 2 q_{12} \ne 0$

and it is possible to apply the basic transformation.

None of the transformations is applicable if and only if all
the coefficients $q_{ij}$ are zero, i.e. if $Q = 0$. But in this case
any basis is obviously $Q$-orthogonal and therefore one need
not do anything with it.

Consequently, applying our transformations in the necessary
succession we sooner or later obtain a $Q$-orthogonal
basis. □
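
The three transformations of the proof can be combined into a short program. The Python sketch below is one possible reading of the Lagrange algorithm, not a definitive implementation: it works with the matrix of the functional over exact fractions, the names `lagrange` and `transform` are assumptions, and in the second auxiliary step the pair of indices is chosen by a simple search rather than renumbered to $(1, 2)$ as in the text.

```python
from fractions import Fraction


def lagrange(Q0):
    """Return (C, d): columns of C are a Q-orthogonal basis, d its diagonal."""
    n = len(Q0)
    Q = [[Fraction(v) for v in row] for row in Q0]
    C = [[Fraction(int(i == j)) for j in range(n)] for i in range(n)]

    def add_vector(dst, src, f):
        # Basis change e_dst := e_dst + f * e_src (columns, then row of Q).
        for M in (Q, C):
            for row in M:
                row[dst] += f * row[src]
        Q[dst] = [a + f * b for a, b in zip(Q[dst], Q[src])]

    def swap(i, j):
        # Basis change exchanging e_i and e_j.
        for M in (Q, C):
            for row in M:
                row[i], row[j] = row[j], row[i]
        Q[i], Q[j] = Q[j], Q[i]

    for k in range(n):
        if Q[k][k] == 0:
            i0 = next((i for i in range(k + 1, n) if Q[i][i] != 0), None)
            if i0 is not None:
                swap(k, i0)                      # first auxiliary step
            else:
                pair = next(((i, j) for i in range(k, n)
                             for j in range(i + 1, n) if Q[i][j] != 0), None)
                if pair is None:
                    break                        # the restriction is zero
                i, j = pair
                add_vector(i, j, Fraction(1))    # second auxiliary step
                if i != k:
                    swap(k, i)
        for i in range(k + 1, n):                # basic transformation
            if Q[k][i] != 0:
                add_vector(i, k, -Q[k][i] / Q[k][k])
    return C, [Q[i][i] for i in range(n)]


def transform(C, Q0):
    """Matrix C^T Q0 C of the functional in the basis given by C."""
    n = len(Q0)
    return [[sum(C[a][i] * Q0[a][b] * C[b][j]
                 for a in range(n) for b in range(n))
             for j in range(n)] for i in range(n)]


C, d = lagrange([[0, 1], [1, 0]])   # the form 2 x^1 x^2, an irregular case
print(C, d)
```

The helper `transform` can be used to confirm that the matrix of the functional in the returned basis is diagonal with the returned entries on the diagonal.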



In a $Q$-orthogonal basis the matrix of the functional $Q$ is
obviously diagonal, i.e. has the form

$$\begin{pmatrix} \lambda_1 & & 0 \\ & \ddots & \\ 0 & & \lambda_n \end{pmatrix}$$

and hence

$Q(\mathbf{x}) = \lambda_1 (x^1)^2 + \dots + \lambda_n (x^n)^2$

for any vector $\mathbf{x}$. In terms of quadratic forms therefore the
Lagrange theorem asserts that any quadratic form
$Q(x^1, \dots, x^n)$ is equivalent to a form

(11) $\lambda_1 (x^1)^2 + \dots + \lambda_n (x^n)^2$.

The form (11) is said to be of normal form. Thus we see
that any quadratic form $Q(x^1, \dots, x^n)$ can be reduced to a
normal form (11) by means of a nonsingular linear transformation (7).

The last statement, also known as the Lagrange theorem,
belongs entirely to algebra, and all traces of its geometric origin
have disappeared in it. It is therefore applicable to quadrat¬ 
ic forms arising in any questions (say in mechanics) that 
are a priori in no way connected with the geometry of quad¬ 
ratic functionals. 

In practice, reducing a quadratic form $Q(x^1, \dots, x^n)$ to
normal form should be carried out by successively “selecting
squares”, i.e. by using the identity

$Q(x^1, \dots, x^n) = \dfrac{1}{q_{11}} \left(q_{11} x^1 + \dots + q_{1n} x^n\right)^2 + Q'$,

where the form $Q'$, as can be easily seen, no longer contains the
variable $x^1$. This identity corresponds to the basic Lagrange
transformation. In the irregular case one has in addition to
renumber the variables and use transformations of the form

$y^1 = x^1 - x^2$,

$y^i = x^i$, if $i \ge 2$.

Some of the coefficients $\lambda_1, \dots, \lambda_n$ (or all of them) of
the form (11) may be zero. It is clear that the number $r$
of the nonzero coefficients equals the rank of the matrix (6)
and hence the rank of the functional $Q$. Transposing the
elements of the basis, if necessary, we can always see to it
that the first $r$ coefficients $\lambda_1, \dots, \lambda_r$ should be nonzero.
Since it is not necessary to write the terms with zero coefficients
we finally find that the normal quadratic form of
rank $r$ is the form

$\lambda_1 (x^1)^2 + \dots + \lambda_r (x^r)^2$, where $\lambda_1 \ne 0, \dots, \lambda_r \ne 0$.



Lecture 12 


Jacobi theorem • Quadratic forms over the fields of complex 
and real numbers • The law of inertia • Positively definite 
quadratic functionals and forms 


Recall that a square matrix is said to be triangular (more
precisely, upper-triangular) if all its elements below the
principal diagonal are zero. The determinant of such a
matrix is obviously equal to the product of the diagonal
elements of the matrix. A triangular matrix therefore is nonsingular
if and only if all its diagonal elements are nonzero.
Of particular importance are triangular matrices all diagonal
elements of which are equal to unity. We shall call such
matrices unitriangular matrices.

A direct computation shows that a product of two (uni)triangular
matrices and the inverse of a (uni)triangular
matrix are also (uni)triangular matrices.

Since the matrix of the basic transformation in the Lagrange
algorithm is a unitriangular matrix

$$\begin{pmatrix}
1 & -\frac{q_{12}}{q_{11}} & \dots & -\frac{q_{1n}}{q_{11}} \\
0 & 1 & \dots & 0 \\
\vdots & & \ddots & \vdots \\
0 & 0 & \dots & 1
\end{pmatrix}$$

it follows that in the regular case transition to a $Q$-orthogonal
basis is effected by a unitriangular matrix.

Let $A$ be an arbitrary square matrix of order $n$ and
$1 \le k \le n$. Eliminating the last $n - k$ rows and $n - k$
columns from the matrix $A$ we obtain a square matrix of
order $k$.


Definition 1. This matrix is called the principal submatrix
of order $k$ of the matrix $A$ and its determinant is called the
principal minor of order $k$ of the matrix $A$.

Let $Q$ and $Q'$ be matrices of a quadratic functional $Q$ in
two bases $\mathbf{e}_1, \dots, \mathbf{e}_n$ and $\mathbf{e}'_1, \dots, \mathbf{e}'_n$ connected by a
unitriangular transition matrix $C$. Then the following obvious
assertions hold.

(a) The principal submatrix $C_k$ of order $k$ of the matrix $C$
is a transition matrix connecting the bases $\mathbf{e}_1, \dots, \mathbf{e}_k$ and
$\mathbf{e}'_1, \dots, \mathbf{e}'_k$ of the subspace $\mathcal{V}_k = [\mathbf{e}_1, \dots, \mathbf{e}_k] = [\mathbf{e}'_1, \dots, \mathbf{e}'_k]$.

(b) The restriction $Q|_{\mathcal{V}_k}$ of the functional $Q$ to the
subspace $\mathcal{V}_k$ is a quadratic functional whose matrix in the
basis $\mathbf{e}_1, \dots, \mathbf{e}_k$ is the principal submatrix $Q_k$ of order $k$
of the matrix $Q$, the matrix in the basis $\mathbf{e}'_1, \dots, \mathbf{e}'_k$ being
the principal submatrix $Q'_k$ of order $k$ of the matrix $Q'$.

It follows that

$Q'_k = C_k^{\top} Q_k C_k$

for any $k = 1, \dots, n$. Switching to determinants and
considering that $\det C_k = \det C_k^{\top} = 1$ it follows that

(1) $\det Q'_k = \det Q_k$

for any $k = 1, \dots, n$.

In particular, if

$$Q' = \begin{pmatrix} \lambda_1 & & 0 \\ & \ddots & \\ 0 & & \lambda_n \end{pmatrix}$$

then

$\det Q_k = \lambda_1 \cdots \lambda_k$

for any $k = 1, \dots, n$.

This proves (we pass to the language of quadratic forms)
that if for a quadratic form $Q(x^1, \dots, x^n)$ the regular case
holds then the coefficients $\lambda_1, \dots, \lambda_n$ of its normal form
satisfy the relations

(2) $\lambda_1 \cdots \lambda_k = D_k, \quad k = 1, \dots, n$,

where $D_k$ are the principal minors of the quadratic form. □



Note now that carrying out the basic transformation in the
Lagrange algorithm we obtain every time a nonzero coefficient
$\lambda$ (for example the very first transformation yields
the coefficient $\lambda_1 = q_{11} \ne 0$). In the regular case the process
comes to a stop when after some (say the $r$-th) step we obtain
an identical zero (so that all the remaining coefficients
$\lambda_{r+1}, \dots, \lambda_n$ turn out to be zero). It follows from this and
relations (2) as well, firstly, that in the regular case

(3) $D_1 \ne 0, \dots, D_r \ne 0$,

where $r$ is the rank of the form (functional) and, secondly,
that

$\lambda_1 = D_1, \quad \lambda_2 = \dfrac{D_2}{D_1}, \quad \dots, \quad \lambda_r = \dfrac{D_r}{D_{r-1}}$.


Conversely, suppose that for the matrix $Q$ of a quadratic
form inequalities (3), where $r$ is the rank of the matrix, hold.
Then, since $q_{11} = D_1 \ne 0$, the basic transformation of the
Lagrange algorithm is applicable to the form. According to
formula (1) the principal minors of the matrix $Q'$ resulting
from the transformation will coincide with those of the
matrix $Q$ and therefore this matrix will possess properties (3)
as before. But the principal minor $D_2$ of the matrix $Q'$ is
obviously equal to the product $q'_{11} q'_{22}$ (where $q'_{11} = q_{11} = \lambda_1$)
and hence $q'_{22} \ne 0$. Consequently the basic Lagrange
transformation is applicable again to the restriction of the functional
$Q$ to the subspace $[\mathbf{e}'_2, \dots, \mathbf{e}'_n]$, etc.

After $r$ steps we obtain a matrix of the form

(4)
$$\left(\begin{array}{ccc|c}
\lambda_1 & & 0 & \\
& \ddots & & 0 \\
0 & & \lambda_r & \\
\hline
& 0 & & G
\end{array}\right)$$

where $\lambda_1 \ne 0, \dots, \lambda_r \ne 0$ and $G$ is some matrix. But since
the matrices of the functional have the same rank $r$ in all
the bases, the matrix (4) has rank $r$ too, which is obviously
possible if and only if all the elements of the matrix $G$
are zero.

Thus the matrix (4) is the matrix of a normal quadratic 
form and since we have obtained it using only the basic 









transformations of the Lagrange algorithm it follows that 
the regular case holds for the original form. 

We have thus proved the following theorem. 

Theorem 1 (Jacobi theorem). For a quadratic form of rank
$r$ the regular case holds if and only if the principal minors of
the form are nonzero:

$D_1 \ne 0, \dots, D_r \ne 0$.

The Lagrange algorithm reduces such a form to

$D_1 (x^1)^2 + \dfrac{D_2}{D_1} (x^2)^2 + \dots + \dfrac{D_r}{D_{r-1}} (x^r)^2$. □

This theorem is often very helpful. 
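
As an illustration of how the theorem is used, the coefficients of the normal form can be computed directly from the principal minors. The Python sketch below assumes that the regular case holds and does not check this; the function names and the sample matrix are hypothetical, and determinants are evaluated by exact elimination over fractions.

```python
from fractions import Fraction


def det(M):
    """Determinant computed by exact Gaussian elimination."""
    M = [[Fraction(v) for v in row] for row in M]
    n, sign, result = len(M), 1, Fraction(1)
    for c in range(n):
        p = next((r for r in range(c, n) if M[r][c] != 0), None)
        if p is None:
            return Fraction(0)
        if p != c:
            M[c], M[p] = M[p], M[c]
            sign = -sign              # a row exchange changes the sign
        result *= M[c][c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            M[r] = [a - f * b for a, b in zip(M[r], M[c])]
    return sign * result


def principal_minors(Q):
    """The minors D_1, ..., D_n of the upper left corners of Q."""
    return [det([row[:k] for row in Q[:k]]) for k in range(1, len(Q) + 1)]


def jacobi_coefficients(Q):
    """D_1, D_2/D_1, ..., stopping at the first zero principal minor."""
    coefficients = []
    previous = Fraction(1)
    for D in principal_minors(Q):
        if D == 0:
            break
        coefficients.append(D / previous)
        previous = D
    return coefficients


Q = [[2, 1, 0], [1, 2, 1], [0, 1, 2]]
print(principal_minors(Q), jacobi_coefficients(Q))
```

For the sample matrix the principal minors are 2, 3 and 4, so the normal form has the coefficients 2, 3/2 and 4/3.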

A further simplification of the normal form

(5) $\lambda_1 (x^1)^2 + \dots + \lambda_r (x^r)^2$

of a quadratic form depends on the arithmetic properties of
the field $K$. The simplest case arises when $K = \mathbb{C}$. Using
in this case a transformation of the form

$$\begin{aligned}
y^1 &= \sqrt{\lambda_1}\, x^1, \ \dots, \ y^r = \sqrt{\lambda_r}\, x^r,\\
y^{r+1} &= x^{r+1}, \ \dots, \ y^n = x^n
\end{aligned}$$

we can reduce the form (5) to the following (we omit the
primes in the notation for coordinates)

(6) $(x^1)^2 + \dots + (x^r)^2$.

This proves the following proposition.

Proposition 1. Any quadratic form over the field $\mathbb{C}$ (i.e.
with coefficients in $\mathbb{C}$) can be reduced by a linear nonsingular
transformation of variables (also with coefficients in $\mathbb{C}$) to the
form (6) where $r$ is the rank of the form. □

In other words, any quadratic form $Q(x^1, \dots, x^n)$ of
rank $r$ over the field $\mathbb{C}$ is of the form

$(\varphi_1(x))^2 + \dots + (\varphi_r(x))^2$,

where $\varphi_1(x), \dots, \varphi_r(x)$ are linearly independent linear
forms in $x^1, \dots, x^n$.

Corollary (theorem on the classification of quadratic
forms over the field $\mathbb{C}$). Two quadratic forms over the field
$\mathbb{C}$ are equivalent if and only if their ranks are equal. □

Over the field $\mathbb{R}$ of real numbers we can make the
transformation

$$\begin{aligned}
y^1 &= \sqrt{|\lambda_1|}\, x^1, \ \dots, \ y^r = \sqrt{|\lambda_r|}\, x^r,\\
y^{r+1} &= x^{r+1}, \ \dots, \ y^n = x^n
\end{aligned}$$

reducing the form (5) (possibly after some additional rearrangement
of coordinates) to the form (we again omit the
primes in the coordinates)

(7) $(x^1)^2 + \dots + (x^p)^2 - (x^{p+1})^2 - \dots - (x^r)^2$,

where $r$ is the rank of the form and $p$ some number (satisfying
the inequalities $0 \le p \le r$).

This proves the following proposition. 

Proposition 2. Any quadratic form over the field R can be
reduced by a linear nonsingular transformation of its variables
to the form (7), where r is the rank of the form and $0 \le p \le r$. □

In connection with Proposition 2 the question immediately
arises whether a given quadratic form can be reduced to two
forms (7) with distinct p. It turns out that the answer to
this question is negative.

Proposition 3 (the law of inertia of quadratic forms).
If two forms

(8) $(x^1)^2 + \dots + (x^p)^2 - (x^{p+1})^2 - \dots - (x^r)^2$

and

(9) $(y^1)^2 + \dots + (y^q)^2 - (y^{q+1})^2 - \dots - (y^r)^2$

are equivalent (over the field R), then p = q.









Proof. The equivalence of the forms (8) and (9) means
that they are expressions in two different bases $e_1, \dots, e_n$
and $f_1, \dots, f_n$ for the same quadratic functional Q given in
an n-dimensional vector space $\mathcal{V}$. Let $\mathcal{P}$ be the subspace of
$\mathcal{V}$ generated by the vectors $e_1, \dots, e_p$ and let $\mathcal{Q}$ be the
subspace of $\mathcal{V}$ generated by the vectors $f_{q+1}, \dots, f_n$.

Since the functional Q is expressed by the form (8) in the
basis $e_1, \dots, e_n$, for any nonzero vector $x \in \mathcal{P}$ we have
the relation

$Q(x) = (x^1)^2 + \dots + (x^p)^2 > 0.$

Similarly, for any vector $y \in \mathcal{Q}$ we have

$Q(y) = -(y^{q+1})^2 - \dots - (y^r)^2 \le 0.$

Therefore $\mathcal{P} \cap \mathcal{Q} = 0$, i.e. the subspaces $\mathcal{P}$ and $\mathcal{Q}$ are
disjoint, and hence (Corollary 2 of Theorem 1 in Lecture 1)
for their dimensions there holds the inequality

$\dim \mathcal{P} + \dim \mathcal{Q} \le n,$

i.e. the inequality

$p + (n - q) \le n,$

equivalent to the inequality

$p \le q.$

Similarly, $q \le p$. Therefore p = q. □

Proposition 3 guarantees the correctness of the following 
definition. 

Definition 2. The number p of “positive squares” in the 
reduced form (7) is called the positive inertial index of a given 
quadratic form (quadratic functional) and the number 
r — p of “negative squares” is called the negative inertial 
index. 

In addition Proposition 3 immediately yields the following
corollary.

Corollary (theorem on the classification of quadratic forms
over the field R). Two quadratic forms over the field R are
equivalent if and only if their ranks and inertial indices
coincide. □
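
Definition 2 can be illustrated numerically. The Python sketch below
diagonalizes a real symmetric matrix by simultaneous row and column
operations (a matrix version of the Lagrange algorithm, carried out
over exact rationals) and counts the positive and negative diagonal
entries. By Proposition 3 the counts do not depend on how the
diagonalization is performed. The function names and sample matrices
are our own assumptions, not notation from the lecture.

```python
from fractions import Fraction


def _swap(a, i, j):
    """Swap rows i, j and then columns i, j (a change of basis order)."""
    a[i], a[j] = a[j], a[i]
    for row in a:
        row[i], row[j] = row[j], row[i]


def diagonal_of_congruent_form(q):
    """Diagonal of a diagonal matrix congruent to the symmetric matrix q."""
    a = [[Fraction(x) for x in row] for row in q]
    n = len(a)
    for k in range(n):
        if a[k][k] == 0:
            # First try to move a nonzero diagonal entry into position k.
            j = next((j for j in range(k + 1, n) if a[j][j] != 0), None)
            if j is not None:
                _swap(a, k, j)
            else:
                # All remaining diagonal entries vanish: use a mixed term.
                pair = next(((i, j) for i in range(k, n)
                             for j in range(i + 1, n) if a[i][j] != 0), None)
                if pair is None:
                    break  # the remaining block is zero
                i, j = pair
                for c in range(n):      # row_i += row_j
                    a[i][c] += a[j][c]
                for r in range(n):      # col_i += col_j, so a[i][i] = 2 a[i][j]
                    a[r][i] += a[r][j]
                if i != k:
                    _swap(a, k, i)
        for i in range(k + 1, n):
            f = a[i][k] / a[k][k]
            for c in range(n):          # row_i -= f * row_k
                a[i][c] -= f * a[k][c]
            for r in range(n):          # col_i -= f * col_k
                a[r][i] -= f * a[r][k]
    return [a[i][i] for i in range(n)]


def inertia(q):
    """Return (positive inertial index, negative inertial index)."""
    d = diagonal_of_congruent_form(q)
    return sum(1 for x in d if x > 0), sum(1 for x in d if x < 0)


print(inertia([[0, 1], [1, 0]]))   # the form 2 x^1 x^2
print(inertia([[1, 1], [1, 1]]))   # the form (x^1 + x^2)^2
```

The sum of the two returned numbers is the rank r of the form, so the
pair determines the form up to equivalence over R.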







Of particular importance in vector spaces over the field
R are quadratic functionals Q possessing the property that
$Q(x) > 0$ when $x \ne 0$. Their importance is due to the fact
that the corresponding symmetric bilinear functionals are
precisely all possible scalar multiplications in $\mathcal{V}$ (see
Definition 2 of Lecture 13 in [1]).

Definition 3. A quadratic functional Q in a real vector
space $\mathcal{V}$ is said to be positive definite if $Q(x) > 0$ for any
vector $x \ne 0$.

A quadratic form $Q(x^1, \dots, x^n)$ is said to be positive
definite if it is an expression for a positive definite functional
in some basis, i.e. in other words if $Q(x^1, \dots, x^n) > 0$
when $(x^1, \dots, x^n) \ne (0, \dots, 0)$.

A matrix Q is said to be positive definite if it is the matrix 
of a positive definite quadratic functional (quadratic form), 
i.e. in other words is the matrix of the metric coefficients 
of some basis of a Euclidean space (see Lecture 14 in [1]). 

Proposition 4. A quadratic functional (quadratic form)
is positive definite if and only if its rank and positive inertial
index are equal to n:

$r = n, \quad p = n.$

Proof. If p = r = n, then in some basis the functional Q
is expressed by the form

$(x^1)^2 + \dots + (x^n)^2$

and hence $Q(x) \ge 0$, with $Q(x) = 0$ if and only if
$x^1 = 0, \dots, x^n = 0$, i.e. if x = 0.

Conversely, if p < n or r < n, then in some basis
$e_1, \dots, e_n$ the functional Q is expressed by a form

$Q'(x^1, \dots, x^{n-1}) + \varepsilon (x^n)^2,$

where $Q'(x^1, \dots, x^{n-1})$ is a quadratic form in the
coordinates $x^1, \dots, x^{n-1}$ and $\varepsilon \le 0$. Then $Q(e_n) = \varepsilon \le 0$ and
hence the functional Q is not positive definite. □

This proposition requires a preliminary reduction of the
quadratic form to its normal form and hence is of little
use in practice. Of more interest is the following proposition,
which provides necessary and sufficient conditions for
the positive definiteness of a quadratic form directly from
its matrix.









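Proposition 5 (Sylvester's criterion). A matrix $Q = (q_{ij})$ is
positive definite if and only if all of its principal minors are
positive:

$q_{11} > 0, \quad \begin{vmatrix} q_{11} & q_{12} \\ q_{21} & q_{22} \end{vmatrix} > 0, \quad \begin{vmatrix} q_{11} & q_{12} & q_{13} \\ q_{21} & q_{22} & q_{23} \\ q_{31} & q_{32} & q_{33} \end{vmatrix} > 0, \quad \dots, \quad \begin{vmatrix} q_{11} & \dots & q_{1n} \\ \vdots & & \vdots \\ q_{n1} & \dots & q_{nn} \end{vmatrix} > 0.$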


Proof. If all principal minors of the matrix Q are positive
(and hence nonzero), then by Theorem 1 for a quadratic
form with matrix Q the regular case holds and the form
reduces to the form

$D_1 (x^1)^2 + \dots + \frac{D_n}{D_{n-1}} (x^n)^2,$

where $D_1 > 0, D_2 > 0, \dots, D_n > 0$. Thus p = r = n
and therefore the quadratic form (and hence also the matrix)
is positive definite.

Conversely, if a form with matrix Q is positive definite
then it can be reduced to a sum of n squares, i.e. to a form
with unit matrix E. Therefore (cf. Lecture 14 of [1]) the
matrix Q has the form

$Q = C^{\top} C,$

where C is some nonsingular matrix. Hence

$\det Q = (\det C)^2 > 0.$

This proves that the determinant of a positive definite matrix 
is positive. 

On the other hand, on setting in a quadratic form
$Q(x^1, \dots, x^n)$ in n variables the last n − k variables
$x^{k+1}, \dots, x^n$ equal to zero we obtain a quadratic form

$Q_k(x^1, \dots, x^k) = Q(x^1, \dots, x^k, 0, \dots, 0)$

in k variables $x^1, \dots, x^k$ for which the following
assertions are obviously true.





(a) If a form $Q(x^1, \dots, x^n)$ is positive definite, so is the
form $Q_k(x^1, \dots, x^k)$.

(b) The matrix of the form $Q_k(x^1, \dots, x^k)$ is the
principal submatrix of order k of the matrix of the form
$Q(x^1, \dots, x^n)$.

Consequently, by virtue of the above remark all principal
minors $D_k$, k = 1, ..., n, of a positive definite matrix
are positive. □
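
Sylvester's criterion translates directly into a short test. The
Python sketch below checks the signs of the principal minors of a
symmetric matrix with integer or rational entries; the names `det`
and `is_positive_definite` and the sample matrices are assumptions
made for the illustration, and the cofactor expansion used for the
determinant is only practical for small n.

```python
def det(m):
    """Determinant by cofactor expansion along the first row (small n only)."""
    if len(m) == 1:
        return m[0][0]
    return sum((-1) ** j * m[0][j] * det([row[:j] + row[j + 1:] for row in m[1:]])
               for j in range(len(m)))


def is_positive_definite(q):
    """Sylvester's criterion: every principal minor D_1, ..., D_n is positive."""
    n = len(q)
    return all(det([row[:k] for row in q[:k]]) > 0 for k in range(1, n + 1))


# Hypothetical symmetric sample matrices.
print(is_positive_definite([[2, 1, 0], [1, 2, 1], [0, 1, 2]]))  # minors 2, 3, 4
print(is_positive_definite([[1, 2], [2, 1]]))                   # minors 1, -3
```

The first matrix passes the test and the second fails it, in agreement
with the fact that the form $(x^1)^2 + 4 x^1 x^2 + (x^2)^2$ takes the
value −2 at the vector (1, −1).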

Proposition 5 answers in particular the question, put in 
Lecture 14 of [1], concerning the necessary and sufficient 
conditions a quadratic matrix must satisfy in order to be 
the matrix of the coefficients of some basis of a Euclidean 
space. 







Lecture 13 


Second degree hypersurfaces in an n-dimensional projective 
space • Second degree hypersurfaces in a complex and a real- 
complex projective space • Second degree hypersurfaces in an 
n-dimensional affine space • Second degree hypersurfaces in a 
complex and a real-complex affine space 


Let us apply the results obtained on quadratic forms in the 
preceding lectures to the investigation of second degree 
hypersurfaces in an n-dimensional projective space. 

Definition 1 (cf. Definition 2 of Lecture 25 in [1]). A second
degree hypersurface in an n-dimensional projective space
(over an arbitrary field K of characteristic other than two)
is a set of points whose projective coordinates
$x_0 : x_1 : \dots : x_n$ satisfy an equation of the form

$Q(x_0, x_1, \dots, x_n) = 0,$

where $Q(x_0, x_1, \dots, x_n)$ is some quadratic form in the
coordinates $x_0, x_1, \dots, x_n$. (Now it is convenient to use
subscripts in the coordinates.)

The Lagrange theorem immediately yields the following
theorem.

Theorem 1 (reduction of the equations of second degree
hypersurfaces in an n-dimensional projective space over an
arbitrary field K to normal form). For any second degree
hypersurface in an n-dimensional projective space over a field
K of characteristic other than two there exists a system of
projective coordinates $x_0 : x_1 : \dots : x_n$ in which the equation of the
hypersurface has the form

(1) $\lambda_0 x_0^2 + \lambda_1 x_1^2 + \dots + \lambda_r x_r^2 = 0,$

where $0 \le r \le n$ and $\lambda_0 \ne 0, \dots, \lambda_r \ne 0$. □







When r = n the hypersurface (1) is called an oval second
degree hypersurface.

It is obvious that for any (k − 1)-dimensional plane $\Pi_0$ in a
projective space and any point $M \notin \Pi_0$ in it there exists a
unique k-dimensional plane $M\Pi_0$ containing M and $\Pi_0$.

Definition 2. A hypersurface in an n-dimensional projective
space (over an arbitrary field K) is said to be a k-fold
cylinder (or a k-fold cone, the concepts of cylinder and cone
coinciding in projective space) if there exists a (k − 1)-dimensional
plane $\Pi_0$ (the axial plane of the cylinder) such that
for any point M of the hypersurface not lying in the plane
$\Pi_0$ the plane $M\Pi_0$ lies entirely in the hypersurface.

Every (n − k)-dimensional plane $\Pi$ having no points in
common with the plane $\Pi_0$ cuts the cylinder in a hypersurface
in $\Pi$ which is called a base of the cylinder. A cylinder is
also said to be a cylinder over its base. It is obvious that
every cylinder is a union of all k-dimensional planes of the
form $M\Pi_0$, where M is an arbitrary point in the base of the
cylinder. In this sense the geometry of a cylinder is completely
reducible to the geometry of its base.

Having all this in mind, consider a hypersurface (1) for
r < n. Let $\Pi$ be a plane of dimension r defined by n − r
equations $x_{r+1} = 0, \dots, x_n = 0$. In the plane $\Pi$ the numbers
$x_0, x_1, \dots, x_r$ are projective coordinates and in these
coordinates equation (1) defines some oval second degree
hypersurface. Also let $\Pi_0$ be a plane of dimension n − r − 1
defined by r + 1 equations $x_0 = 0, x_1 = 0, \dots, x_r = 0$.

The fact that together with some point
$(x_0^{(0)} : x_1^{(0)} : \dots : x_n^{(0)})$ the hypersurface (1) contains all points of the
form $(x_0^{(0)} : x_1^{(0)} : \dots : x_r^{(0)} : x_{r+1} : \dots : x_n)$, where $x_{r+1}, \dots, x_n$ are
arbitrary numbers, obviously means that that hypersurface
is an (n − r)-fold cylinder with axial plane $\Pi_0$. Serving as the
base of the cylinder is the hypersurface defined in the plane
$\Pi$ by (1).

This proves the following theorem. 

Theorem 2 (enumeration of second degree hypersurfaces
of an n-dimensional projective space over an arbitrary
field K). Every second degree hypersurface in an n-dimensional
projective space over a field K of characteristic other than two
is either an oval hypersurface or a k-fold ($1 \le k \le n$) cylinder
over an oval hypersurface in an (n − k)-dimensional
projective space. □

A one-dimensional projective space is a straight line and
an oval “hypersurface” in it is a pair of distinct points (or
an empty set). The corresponding (n − 1)-fold cylinder
therefore is a pair of distinct hyperplanes.

For k = n the situation is more intricate. A zero-dimensional
projective space is a point and an oval hypersurface
in it is an empty set. At the same time the equation $x_0^2 = 0$
defines a “double” hyperplane $x_0 = 0$ in an n-dimensional
(n > 0) projective space. To bring this case to common
terminology therefore one has to assume that an n-fold cylinder
over an empty set is a hyperplane in an n-dimensional
space and that in a zero-dimensional projective space an oval
second degree hypersurface is a “doubled” empty set.

It is also convenient to introduce the concept of a 0-fold
cylinder over a given hypersurface meaning by that cylinder
the hypersurface itself. Then any second degree hypersurface
in an n-dimensional projective space will be a k-fold
($0 \le k \le n$) cylinder over some oval hypersurface in an
(n − k)-dimensional space.

In the case K = C all the coefficients $\lambda_0, \dots, \lambda_r$ of
equation (1) may be assumed to be equal to unity. For any
r, $0 \le r \le n$, therefore there is only one hypersurface (1)
and by rank invariance these hypersurfaces are not
projectively equivalent when r are different. This proves the
following theorem.

Theorem 3 (classification of second degree hypersurfaces
of an n-dimensional projective space over the field C). In an
n-dimensional complex projective space there are only n + 1
projectively nonequivalent second degree hypersurfaces: one
oval hypersurface and, for any r, $0 \le r \le n - 1$, an (n − r)-fold
cylinder over an oval hypersurface in an r-dimensional
space. □

In the case K = R the geometrical situation, as we know
from [1], is not adequate to the algebraic one and one has
to introduce real-complex spaces (i.e. to pass to the situation
(C, R); cf. Lecture 20 in [1]).

We stress that the algebraic situation remains unaffected
in this case: all transformations of coordinates continue to
be transformations over R and all equations have real
coefficients.

A second degree hypersurface in a real-complex projective
(or affine; see below) space is said to be s-planar ($s \ge -1$) if
the hypersurface contains no (s + 1)-dimensional plane but
through any of its real points there passes at least one (real)
s-dimensional plane contained entirely in the hypersurface.
In a three-dimensional space, for example, a hyperboloid of
two sheets is 0-planar and a hyperboloid of one sheet is
1-planar. A hypersurface is (−1)-planar if and only if it
contains no real points.

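In the situation (C, R) equation (1) can be reduced to
the form

(2) $x_0^2 + \dots + x_p^2 - x_{p+1}^2 - \dots - x_r^2 = 0,$

it being possible owing to the multiplication of the equation
by −1 to assume without loss of generality that

$-1 \le p \le \left[\frac{r+1}{2}\right] - 1,$

where $\left[\frac{r+1}{2}\right]$ is the integral part of the number $\frac{r+1}{2}$
(if $\left[\frac{r+1}{2}\right] = m$, then r = 2m or r = 2m − 1).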


When r = n the hypersurface (2) is called a nonsingular
second degree hypersurface. It can be shown (do it!) that the
nonsingular hypersurface (2) is p-planar. Thus, in particular,
for p = −1 the nonsingular hypersurface (2) has no real
points. It is called an imaginary oval second degree
hypersurface. When p = 0 the nonsingular hypersurface (2) is called
a real oval second degree hypersurface. When $p \ge 1$ and
r = n the hypersurface (2) is called simply a nonsingular
p-planar second degree hypersurface.

Since p-planarity is obviously a projectively invariant
property, all nonsingular hypersurfaces (2) are projectively
not equivalent.

When r < n the hypersurface (2) is an (n − r)-fold cylinder
over a nonsingular hypersurface in an r-dimensional space,
given by the same equation (2). Therefore all hypersurfaces (2)
are projectively not equivalent either.







This proves the following theorem.

Theorem 4 (classification of second degree hypersurfaces
of a real-complex n-dimensional projective space). In a
real-complex n-dimensional (n > 0) projective space there are
only $\left[\frac{n+1}{2}\right] + 1$ projectively nonequivalent nonsingular
second degree hypersurfaces that are not cylinders: two oval
hypersurfaces (an imaginary and a real one) and (when n > 2)
one p-planar hypersurface for every $p = 1, \dots, \left[\frac{n+1}{2}\right] - 1$.

All the other second degree hypersurfaces are k-fold
($1 \le k \le n$) cylinders over nonsingular hypersurfaces in an
(n − k)-dimensional space (when k = n, they are double
hyperplanes). □

Similar theorems hold of course also in a projective-affine
space obtained from a projective space by choosing some
hyperplane as an ideal hyperplane. In such a space, second
degree hypersurfaces will in addition differ in their positions
relative to the ideal hyperplane. For example, instead of
the single class of k-fold cylinders (cones) there arise two
classes of hypersurfaces: cylinders, if the axial plane $\Pi_0$ is
contained entirely in the ideal hyperplane, and (k − 1)-fold
cylinders over cones, if the plane $\Pi_0$ has proper points (in the
case where $\Pi_0$ is a proper point there occur simply cones).
Therefore the classification of second degree hypersurfaces even
in a complex projective-affine space, trivial as it is, is rather
awkward. That is why we shall not even formulate the
corresponding theorems.

On removing from the projective-affine space the ideal
hyperplane we obtain an affine space. Therefore a classification
of second degree hypersurfaces in a complex affine
space can be obtained from their classification in a
projective-affine space, the number of classes becoming only smaller.
To attain a greater geometrical clarity, however, we prefer
to obtain this classification directly.


Let $\mathcal{A}$ be an affine n-dimensional space (over an as yet
arbitrary field K of characteristic other than two) and let $\mathcal{V}$
be an associated vector space.

Definition 3. A second degree hypersurface in an affine
space $\mathcal{A}$ is a subset of the space, consisting of points whose
affine coordinates $x_1, \dots, x_n$ satisfy an equation of the
form

$F(x_1, \dots, x_n) = 0,$

where $F(x_1, \dots, x_n)$ is some second degree polynomial in
$x_1, \dots, x_n$. Cf. Definition 2 of Lecture 18 in [1].

By introducing the vector $x = x_1 e_1 + \dots + x_n e_n$ (i.e.
the radius vector of a point $M(x_1, \dots, x_n)$) we can write
the equation $F(x_1, \dots, x_n) = 0$ in the following “vector”
form:

(3) $A(x) + 2a(x) + a_{00} = 0,$

where A is some quadratic functional:

$A(x) = \sum_{i,j=1}^{n} a_{ij} x_i x_j,$

a is some linear functional:

$a(x) = \sum_{i=1}^{n} a_{i0} x_i,$

and $a_{00}$ is some number. (According to the notation adopted
in the first semester, it would be necessary to write the index
n + 1 instead of the index 0, but for the sake of simplicity
we prefer to change the notation.)

By translating the origin of coordinates O into a point O'
we obtain for each point $M \in \mathcal{A}$ a new radius vector
$x' = \overrightarrow{O'M}$ connected with the previous radius vector
$x = \overrightarrow{OM}$ by the relation

$x = x' + x_0,$

where $x_0 = \overrightarrow{OO'}$. Therefore equation (3) is replaced by the
equation

$A(x' + x_0) + 2a(x' + x_0) + a_{00} = 0,$

i.e. (we drop the prime in the notation for the vector x') by
the equation

$A'(x) + 2a'(x) + a'_{00} = 0,$

where

(4) $A' = A, \quad a' = a_0 + a, \quad a'_{00} = A(x_0) + 2a(x_0) + a_{00}.$

(Here the symbol $a_0$ denotes the associated covector
$x \mapsto A(x, x_0)$.)

Definition 4. A point with radius vector
$x_0 = x_1^0 e_1 + \dots + x_n^0 e_n$ is said to be a centre of hypersurface (3) if

$a_0 + a = 0,$

i.e. if

(5) $\sum_{i=1}^{n} a_{ij} x_i^0 + a_{j0} = 0, \quad j = 1, \dots, n.$

Relations (5) constitute a system of n equations in n
unknowns $x_1^0, \dots, x_n^0$. If the system has a unique solution,
i.e. if there exists a unique centre, then the hypersurface (3)
is said to be central, otherwise it is said to be noncentral.
The determinant of system (5) is the determinant

(6) $\delta = \begin{vmatrix} a_{11} & \dots & a_{1n} \\ \vdots & & \vdots \\ a_{n1} & \dots & a_{nn} \end{vmatrix}$

of the matrix of the functional A. Therefore, if $\delta \ne 0$, then
according to Cramer's rule system (5) has a unique solution.
If, however, $\delta = 0$, then the matrix rank r of system (5) is
less than n and therefore (the Capelli-Kronecker theorem)
system (5) is either incompatible (there are no centres),
which is the case when the rank of the matrix

(7) $\begin{pmatrix} a_{10} & a_{11} & \dots & a_{1n} \\ \vdots & \vdots & & \vdots \\ a_{n0} & a_{n1} & \dots & a_{nn} \end{pmatrix}$

is equal to r + 1, or defines in the space $\mathcal{A}$ a plane (a plane
of centres) of dimension n − r, when the rank of the matrix
(7) is equal to r.

Thus the hypersurface (3) is central if and only if $\delta \ne 0$. □
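
As an illustration of Definition 4, the Python sketch below finds the
centre of a central hypersurface by Cramer's rule applied to system
(5) and then evaluates the new constant term given by the last of
formulas (4). The function names and the sample conic
$x_1^2 + 2 x_1 x_2 + 3 x_2^2 - 4 x_1 + 2 x_2 + 1 = 0$ are our own
assumptions; here `A` holds the coefficients $a_{ij}$, `a` the
coefficients $a_{i0}$, and `a00` the number $a_{00}$.

```python
from fractions import Fraction


def det(m):
    """Determinant by cofactor expansion along the first row (small n only)."""
    if len(m) == 1:
        return m[0][0]
    return sum((-1) ** j * m[0][j] * det([row[:j] + row[j + 1:] for row in m[1:]])
               for j in range(len(m)))


def centre(A, a):
    """Unique solution x0 of A x0 = -a by Cramer's rule, or None if delta = 0."""
    delta = det(A)
    if delta == 0:
        return None  # noncentral: no centre or a whole plane of centres
    x0 = []
    for j in range(len(A)):
        # Replace column j of A by the right-hand side -a.
        Aj = [row[:j] + [-a[i]] + row[j + 1:] for i, row in enumerate(A)]
        x0.append(Fraction(det(Aj), delta))
    return x0


def translated_constant(A, a, a00, x0):
    """New constant term A(x0) + 2 a(x0) + a00 from formulas (4)."""
    n = len(A)
    quad = sum(A[i][j] * x0[i] * x0[j] for i in range(n) for j in range(n))
    lin = sum(a[i] * x0[i] for i in range(n))
    return quad + 2 * lin + a00


A = [[1, 1], [1, 3]]   # quadratic part, delta = 2
a = [-2, 1]            # half of the linear coefficients -4 and 2
x0 = centre(A, a)
print([str(c) for c in x0])
print(translated_constant(A, a, 1, x0))
```

For the sample conic the centre is (7/2, −3/2) and the translated
equation is $A(x) - \frac{15}{2} = 0$.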






If the hypersurface (3) has at least one centre, then by
translating the origin of coordinates into the centre we
obtain for it an equation of the form

$A(x) + a_{00} = 0.$

If $a_{00} \ne 0$, then we may divide the equation by $a_{00}$ without
loss of generality. In addition, according to the Lagrange
theorem we may choose a basis of the coordinate system so
that we have

(8) $A(x) = \lambda_1 x_1^2 + \dots + \lambda_r x_r^2,$

where $\lambda_1 \ne 0, \dots, \lambda_r \ne 0$. This proves that for a second
degree hypersurface with a centre there exists a system of affine
coordinates $x_1, \dots, x_n$ in which its equation has the form

$\lambda_1 x_1^2 + \dots + \lambda_r x_r^2 = \varepsilon,$

where $\lambda_1 \ne 0, \dots, \lambda_r \ne 0$ and $\varepsilon = 0, 1$. □

When r = n the hypersurface is central and when r < n
it is an (n − r)-fold cylinder over a central hypersurface in
an r-dimensional space.

Suppose now that the hypersurface (3) has no centre
(which, we remark, is possible only when n > 1). This
means that in the conjugate space $\mathcal{V}'$ the covector a is not
of the form $-a_0$, i.e. is not a covector associated with the
bilinear functional A and hence is not in the rank space M
of the functional A.

We reduce the quadratic functional A to the form (8).
This means that in the rank space M of the corresponding
bilinear functional A we find a basis $e^1, \dots, e^r$ such that

$A = \lambda_1 e^1 \otimes e^1 + \dots + \lambda_r e^r \otimes e^r.$

(To obtain a basis $e_1, \dots, e_n$ of the space $\mathcal{V}$ in which the
values of the quadratic functional A are expressed by
formula (8) one should extend this basis to a basis $e^1, \dots, e^n$
of the space $\mathcal{V}'$ and change to the conjugate basis.)

Since $a \notin M$, it is clear that we may choose the basis
$e^1, \dots, e^n$ so that we have $e^{r+1} = -a$, i.e. so that
$a(x) = -x_{r+1}$ for any vector $x \in \mathcal{V}$.




In such a basis, for any initial point O an equation of the
form (3) becomes

(9) $\lambda_1 x_1^2 + \dots + \lambda_r x_r^2 - 2x_{r+1} + a_{00} = 0.$

By translating the origin of coordinates O into a point with
coordinates

$(\underbrace{0, \dots, 0}_{r \text{ times}}, \tfrac{a_{00}}{2}, 0, \dots, 0)$

we obviously (see the last of the formulas (4)) obtain an
equation of the form (9) with $a_{00} = 0$.

This proves the following theorem. 

Theorem 5 (reduction of the equations of second degree
hypersurfaces in an n-dimensional affine space over an
arbitrary field K to normal form). For any second degree
hypersurface in an n-dimensional affine space over a field K
of characteristic other than two there exists a system of affine
coordinates in which its equation has either the form

(I) $\lambda_1 x_1^2 + \dots + \lambda_r x_r^2 = \varepsilon,$

where $1 \le r \le n$ and $\varepsilon = 0$ or 1, or (which is possible only
when n > 1) the form

(II) $\lambda_1 x_1^2 + \dots + \lambda_r x_r^2 = 2x_{r+1},$

where $1 \le r \le n - 1$, with $\lambda_1 \ne 0, \dots, \lambda_r \ne 0$ in both
cases. □

When r = n and $\varepsilon = 1$ the hypersurface (I) is called an
oval second degree hypersurface. When r = n and $\varepsilon = 0$ it is
called a second degree cone and is a cone over an oval
hypersurface in an (n − 1)-dimensional space. When r < n the
hypersurface (I) is an (n − r)-fold cylinder whose base is
either an oval hypersurface (when $\varepsilon = 1$) or a second degree
cone (when $\varepsilon = 0$) in an r-dimensional space.

The hypersurface (II) is called a paraboloid when
r = n − 1. When r < n − 1 it is an (n − r − 1)-fold cylinder
over a paraboloid in an (r + 1)-dimensional space.

Thus the following theorem holds. 

Theorem 6 (enumeration of second degree hypersurfaces
in an n-dimensional affine space over an arbitrary field K).
Every second degree hypersurface in an n-dimensional affine
space over a field K of characteristic other than two is either

(a) an oval hypersurface or

(b) a cone or

(c) a paraboloid (when n > 1) or

(d) a k-fold cylinder, $1 \le k \le n - 1$, over one of the
hypersurfaces of types (a), (b), (c) in an (n − k)-dimensional
affine space.

Hypersurfaces of different types are affinely not equivalent. 

The last statement follows from the fact that 

(i) hypersurfaces of type (b) possess a vertex (a point for 
which the straight line connecting it to an arbitrary point 
of a hypersurface lies entirely on that hypersurface) while 
those of type (a) do not; 

(ii) hypersurfaces of types (a) and (b) have a centre of 
symmetry while those of type (c) have not; 

(iii) hypersurfaces of type (d) are cylinders while those 
of types (a), (b) and (c) are not. □ 

When K = C there is, up to affine equivalence, only one
second degree hypersurface in each of the classes (a), (b),
and (c). This means that the following theorem is true:

Theorem 7 (classification of second degree hypersurfaces
of an n-dimensional affine space over the field C). In an
n-dimensional complex affine space there are only two affinely
nonequivalent second degree hypersurfaces for n = 1: an oval
hypersurface consisting of two distinct points and a second degree
cone representing two coincident points, and for n > 1 there
are three such hypersurfaces that are not cylinders: an oval
hypersurface, a second degree cone, and a paraboloid. The
other second degree hypersurfaces in an n-dimensional (n > 1)
affine space are k-fold ($1 \le k \le n - 1$) cylinders over the
three (two for k = n − 1) indicated hypersurfaces in an
(n − k)-dimensional affine space. □

When K = R (in the situation (C, R)) equation (I) can
be reduced to the form

(I') $x_1^2 + \dots + x_p^2 - x_{p+1}^2 - \dots - x_r^2 = \varepsilon,$

where $\varepsilon = -1$, 0 or 1 and $1 \le r \le n$, and equation (II)
to the form

(II') $x_1^2 + \dots + x_p^2 - x_{p+1}^2 - \dots - x_r^2 = 2x_{r+1},$

where $1 \le r \le n - 1$, it being possible, owing to the
multiplication by −1 (and the change of the sign in the
coordinate $x_{r+1}$), to assume without loss of generality in both
cases that $0 \le p \le \frac{r}{2}$ (and in case (I'), with $p = \frac{r}{2}$ and hence
with r even, also, in addition, that $\varepsilon \ne -1$).

When r = n the hypersurface (I') is called a nonsingular
second degree hypersurface. When $\varepsilon \ne 0$ and p = 0 the
nonsingular hypersurface is called an ellipsoid, an imaginary
one if $\varepsilon > 0$ and a real one if $\varepsilon < 0$. When n = 2 we have
an imaginary and a real ellipse, and when n = 1 we have
pairs of imaginary or real points.

When $\varepsilon = 0$ the nonsingular second degree hypersurface
is called a second degree cone. When p = 0 the second degree
cone contains only one real point and for this reason it is
usually called an imaginary second degree cone.

When $\varepsilon \ne 0$ and $1 \le p \le \frac{n}{2}$ the nonsingular second degree
hypersurface is called an $\varepsilon$-hyperboloid.

When n = 2 there exists only one hyperboloid (a hyperbola)
and two cones (pairs of imaginary and real intersecting
straight lines). When n = 1, there are no hyperboloids and
there is only one cone (a pair of coincident points).

Just as in the projective space the second degree
hypersurface in a real-complex affine space is said to be s-planar
if at least one s-dimensional plane lying entirely in the
hypersurface passes through any of its real points, but
no (s + 1)-dimensional plane is contained in the hypersurface.

It can be shown (do it!) that every $\varepsilon$-hyperboloid is
s-planar, where s = p − 1 if $\varepsilon = 1$ and s = p if $\varepsilon = -1$, and
that every second degree cone is p-planar.

When r < n hypersurfaces (I') are (n − r)-fold cylinders
over nonsingular hypersurfaces in an r-dimensional space.

When r = n − 1 the hypersurface (II') is called a
paraboloid, an elliptical one if p = 0 (for n = 2 it is a parabola)
and a hyperbolic one if $1 \le p \le \frac{n-1}{2}$. It can be shown (do
it!) that every paraboloid is p-planar.

When r < n − 1 the hypersurface (II') is an (n − r − 1)-fold
cylinder over a paraboloid in an (r + 1)-dimensional space.





As in the case K = C, it is proved that ellipsoids together
with hyperboloids, as well as cones, paraboloids and
cylinders are affinely not equivalent. Paraboloids are affinely
not equivalent, for they are p-planar when p are different.
For the same reason, neither are cones, nor $\varepsilon$-hyperboloids
with the same $\varepsilon$. The real and imaginary ellipsoids are
in the obvious way affinely not equivalent to each other,
nor are they to any $\varepsilon$-hyperboloid, with a possible exception
of the 1-hyperboloid with p = 1 (i.e. the 0-planar one).
But there are hyperbolas among the sections of the latter
hyperboloid by two-dimensional planes, which is not true
for the ellipsoid. Therefore the ellipsoid and the 0-planar
1-hyperboloid are affinely not equivalent either. Finally,
when $s \ge 1$, for the s-planar 1-hyperboloid (corresponding to
the value p = s + 1) the maximum dimension of planes
cutting it in an imaginary ellipsoid (i.e. not intersecting
it in the real domain) equals (prove it!) n − s − 1 = n − p
and for the s-planar (−1)-hyperboloid (for which p = s)
a similar dimension equals (prove it!) s = p. Since in this
situation the equation p = n − p is impossible (for when
n = 2p the case $\varepsilon = -1$ is excluded under the hypothesis),
we see that the s-planar (±1)-hyperboloids are affinely not
equivalent either.

This proves the following theorem. 

Theorem 8 (classification of second degree hypersurfaces
of an n-dimensional affine space in the situation (C, R)).
In the n-dimensional real-complex affine space there are only
the following affinely nonequivalent second degree
hypersurfaces that are not cylinders:

(a) two ellipsoids (an imaginary and a real one);

(b) one s-planar 1-hyperboloid for any
$s = 0, 1, \dots, \left[\frac{n}{2}\right] - 1$;

(c) one s-planar (−1)-hyperboloid for any s = 1, ..., m,
where $m = \frac{n}{2} - 1$ if n is even, and $m = \frac{n-1}{2}$ if n is odd;

(d) one p-planar second degree cone for any
$p = 0, 1, \dots, \left[\frac{n}{2}\right]$ (for p = 0 it is an imaginary cone);

(e) one p-planar paraboloid for any $p = 0, \dots, \left[\frac{n-1}{2}\right]$ (for
p = 0 it is an elliptical paraboloid and for
$p = 1, \dots, \left[\frac{n-1}{2}\right]$ we have hyperbolic paraboloids).

All the other second degree hypersurfaces are k-fold
cylinders ($1 \le k \le n - 1$) over the enumerated hypersurfaces in
an (n − k)-dimensional affine space. □




Lecture 14 


The algebra of linear operators • Operators and mixed bilinear
functionals • Linear operators and matrices • Invertible
operators • The adjoint operator • The Fredholm alternative • Invariant
subspaces and induced operators


Let us now return to the theory of vector spaces and consider
the last type of bilinear functionals which we have not
studied yet, mixed functionals $B\colon x, \xi \mapsto B(x, \xi)$, where
$x \in \mathcal{V}$, $\xi \in \mathcal{V}'$ (see Lecture 5). It turns out that these
functionals are closely related to homomorphisms $\mathcal{V} \to \mathcal{W}$ (see
Definition 5 of Lecture 3) for which $\mathcal{W} = \mathcal{V}$.

Definition 1. Homomorphisms from $\mathcal{V}$ into $\mathcal{V}$ are called
linear operators on $\mathcal{V}$.

Thus the mapping

(1) $A\colon \mathcal{V} \to \mathcal{V}$

is a linear operator if

$A(x + y) = Ax + Ay$

and

$A(kx) = kAx$

for any vectors $x, y \in \mathcal{V}$ and any number $k \in K$.

The sum A + B of linear operators A and B and the product
kA of a linear operator A by a number $k \in K$ are defined
in the usual way:

$(A + B)x = Ax + Bx,$
$(kA)x = k(Ax),$

and are obviously linear operators. It can be immediately
verified that under these operations the set $\mathrm{Op}(\mathcal{V})$ of all
linear operators on $\mathcal{V}$ is a vector space.

Serving as the zero of that space is the zero operator 0 acting
according to the formula

$0(x) = 0.$

For operators the multiplication $A, B \mapsto AB$, where, as is
usual for mappings, the composition $A \circ B$ of operators is
regarded as their product AB, is defined as well as addition.
Thus

$(AB)x = A(Bx)$

for any vector $x \in \mathcal{V}$. The operator AB is obviously linear.

A trivial calculation shows that multiplication of operators
is associative:

$(AB)C = A(BC)$

(so that it is possible not to write parentheses in the product
of any number of operators) and distributive over addition:

$A(B + C) = AB + AC.$

This means that the set $\mathrm{Op}(\mathcal{V})$ is also a ring.

The ring possesses a unity, which is the identity operator
$E\colon \mathcal{V} \to \mathcal{V}$ leaving every vector x fixed:

$Ex = x.$

In general $AB \ne BA$, so that the ring $\mathrm{Op}(\mathcal{V})$ is
noncommutative (for n > 1).

Multiplication of operators is related to their multiplication
by numbers $k \in K$ by the formula

(2) $(kA)B = A(kB) = k(AB)$

whose proof reduces to a trivial calculation.

Rings which are at the same time vector spaces and in
which relation (2) holds are called algebras. Thus, summing
up all the foregoing we see that the set $\mathrm{Op}(\mathcal{V})$ is an algebra. □

From relation (2) it follows in particular that

$(kE)A = A(kE)$




for any operator A. Thus operators of the form kE, called
scalar operators, commute with all operators.

It turns out (try to show this on your own) that this
property characterizes scalar operators, i.e. any operator
commuting with every operator of $\mathrm{Op}(\mathcal{V})$ is scalar. The
algebra $\mathrm{Op}(\mathcal{V})$ can thus be said to be noncommutative to a
maximum extent (to an extent permitted by the structure
of the algebra).

Every operator A defines according to the formula

$A(x, \xi) = \xi(Ax)$

some mixed bilinear functional $A \in T_1^1(\mathcal{V})$. Conversely, for
any mixed bilinear functional A the correspondence assigning
to an arbitrary vector $x \in \mathcal{V}$ an associated covector

$Ax\colon \xi \mapsto A(x, \xi)$

of the space $\mathcal{V}'$ (i.e., by virtue of the identification $(\mathcal{V}')' = \mathcal{V}$,
a vector of the space $\mathcal{V}$) is a linear operator $A \in \mathrm{Op}(\mathcal{V})$. Since
the constructed mappings (from operators to functionals and
from functionals to operators) are obviously
reciprocal, each is bijective. Since these mappings obviously
carry a sum over into a sum and a product by a number into
a product by the same number, they are both isomorphisms.
This proves that the vector spaces $\mathrm{Op}(\mathcal{V})$ and $T_1^1(\mathcal{V})$ are
isomorphic in a natural way. □

As a rule we shall identify an operator with the
corresponding bilinear functional.

Let $e_1, \dots, e_n$ be some basis chosen in the space $\mathcal{V}$. Then for any vector $x = x^i e_i$ we have

(3) $\mathbf{A}x = x^1 a_1 + \dots + x^n a_n,$

where $a_1 = \mathbf{A}e_1, \dots, a_n = \mathbf{A}e_n$. Conversely, for any family of vectors $a_1, \dots, a_n$ formula (3) uniquely defines some linear operator $\mathbf{A}$ for which $a_1 = \mathbf{A}e_1, \dots, a_n = \mathbf{A}e_n$. Thus, with the basis $e_1, \dots, e_n$ fixed, the operators $\mathbf{A} \in \operatorname{Op}(\mathcal{V})$ are in bijective correspondence with $n$-member families of vectors $a_1, \dots, a_n$. □

To every such family there corresponds a square matrix whose columns consist of the coordinates of the vectors $a_1, \dots, a_n$ (in the same basis $e_1, \dots, e_n$):

$A = \begin{pmatrix} a_1^1 & \cdots & a_n^1 \\ \vdots & & \vdots \\ a_1^n & \cdots & a_n^n \end{pmatrix}.$

Since this obviously establishes a bijective correspondence between matrices and families $a_1, \dots, a_n$ of vectors, we thus obtain a bijective correspondence between operators and square matrices of order $n$. An automatic computation verifies that this correspondence is an isomorphism (carries a sum over into a sum and a product by a number into a product by the same number).

Thus we have proved the following proposition. 

Proposition 1. The choice of a basis $e_1, \dots, e_n$ of an $n$-dimensional vector space $\mathcal{V}$ over a field $K$ establishes an isomorphism between the algebra of operators $\operatorname{Op}(\mathcal{V})$ and the algebra of square matrices of order $n$ over $K$. □

Corresponding to an operator $\mathbf{A}$ under this isomorphism is the matrix $A$ whose columns consist of the coordinates of the vectors $\mathbf{A}e_1, \dots, \mathbf{A}e_n$ in the basis $e_1, \dots, e_n$.

Definition 2. The matrix $A$ is called the matrix of the operator $\mathbf{A}$ in the basis $e_1, \dots, e_n$.

Since $a_i^j = e^j(\mathbf{A}e_i)$, we see that $A$ is simultaneously the matrix of the mixed bilinear functional $A$. It follows (see Lecture 5) that the matrix $A' = (a_{i'}^{j'})$ of the operator $\mathbf{A}$ in any other basis $e_{1'}, \dots, e_{n'}$ is expressed by the formula

(5) $A' = C^{-1}AC,$

where $C = (c_{i'}^{i})$ is the transition matrix. However, (5) can be established without difficulty by direct computation: since $e_{i'} = c_{i'}^{i} e_i$ and $e_j = c_j^{j'} e_{j'}$, we have

$a_{i'}^{j'} e_{j'} = \mathbf{A}e_{i'} = c_{i'}^{i}\,\mathbf{A}e_i = c_{i'}^{i} a_i^j e_j = c_j^{j'} a_i^j c_{i'}^{i}\, e_{j'},$

and this is equivalent to (5). Of course this computation is in fact a repetition of the one in Lecture 5.

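To carry out the same computation in matrix notation we introduce the row matrices of vectors

$e = (e_1, \dots, e_n), \quad e' = (e_{1'}, \dots, e_{n'}),$

$\mathbf{A}e = (\mathbf{A}e_1, \dots, \mathbf{A}e_n), \quad \mathbf{A}e' = (\mathbf{A}e_{1'}, \dots, \mathbf{A}e_{n'}).$

Then (cf. formula (14) of Lecture 10 in [1])

$e' = eC, \quad e = e'C^{-1}$

and

$\mathbf{A}e = eA, \quad \mathbf{A}e' = e'A'.$

On the other hand, by linearity

$\mathbf{A}e' = \mathbf{A}(eC) = (\mathbf{A}e)C.$

Therefore

$e'A' = \mathbf{A}e' = (\mathbf{A}e)C = eAC = e'C^{-1}AC,$

and hence $A' = C^{-1}AC$. □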
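As an illustration that is not part of the original lecture, formula (5) can be checked on a small numerical example. The sketch below is written in Python with exact rational arithmetic; the matrices `A` and `C` are hypothetical data chosen for the example, and the helper names `matmul` and `inverse_2x2` are ours.

```python
from fractions import Fraction

def matmul(X, Y):
    """Product of two matrices stored as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))]
            for i in range(len(X))]

def inverse_2x2(M):
    """Inverse of a nonsingular 2x2 matrix with exact rational entries."""
    (a, b), (c, d) = M
    det = Fraction(a * d - b * c)
    return [[d / det, -b / det], [-c / det, a / det]]

# Hypothetical data: the matrix A of an operator in the old basis e_1, e_2;
# its columns hold the coordinates of the images of e_1 and e_2.
A = [[2, 1],
     [0, 3]]
# Transition matrix C: its columns hold the coordinates of the new basis
# vectors e_1' = e_1 and e_2' = e_1 + e_2 in the old basis.
C = [[1, 1],
     [0, 1]]

A_new = matmul(matmul(inverse_2x2(C), A), C)   # formula (5)

# The two sides of the relation e'A' = (Ae)C, written in old coordinates.
left = matmul(C, A_new)
right = matmul(A, C)
```

In this example the new basis happens to consist of eigenvectors of the operator, so `A_new` comes out diagonal; for a generic choice of `C` it would not.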

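An operator $\mathbf{A}$ is said to be nonsingular if $\det A \ne 0$ (and, respectively, singular if $\det A = 0$), where $A$ is the matrix of $\mathbf{A}$ in some basis. It follows immediately from formula (5) that this definition is correct, i.e. does not depend on the choice of the basis, since $\det(C^{-1}AC) = \det A$.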

Of particular interest are invertible operators, i.e. operators $\mathbf{A}$ for which there exists an inverse operator $\mathbf{A}^{-1}$ satisfying the relations

$\mathbf{A}\mathbf{A}^{-1} = \mathbf{A}^{-1}\mathbf{A} = \mathbf{E}.$

The operator A is said to be left invertible if there exists an 
operator B such that 

BA = E, 

and right invertible if there exists an operator C such that 

AC = E. 

In arbitrary rings (or algebras) there may exist elements that are only right or only left invertible. For linear operators the situation is quite different, however: an operator is invertible if it is at least left or right invertible. This is closely related to the (truly remarkable) fact that a linear operator is bijective if it is merely injective or surjective. (We remark here that although an invertible operator is obviously bijective, the statement that any bijective linear operator is invertible, i.e. that the inverse mapping is linear, requires proof.)

Proposition 2. For any linear operator $\mathbf{A}\colon \mathcal{V} \to \mathcal{V}$ the following statements are equivalent.

1° The operator $\mathbf{A}$ is left invertible.

2° The operator $\mathbf{A}$ is injective, i.e. $\operatorname{Ker} \mathbf{A} = 0$.

3° The operator $\mathbf{A}$ is right invertible.

4° The operator $\mathbf{A}$ is surjective, i.e. $\operatorname{Im} \mathbf{A} = \mathcal{V}$.

5° The operator $\mathbf{A}$ is invertible.

6° The operator $\mathbf{A}$ is bijective.

7° The operator $\mathbf{A}$ is nonsingular.

8° For any basis $e_1, \dots, e_n$ of the space $\mathcal{V}$ the vectors $\mathbf{A}e_1, \dots, \mathbf{A}e_n$ also constitute a basis.

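Proof. The equivalence of statements 7° and 8° follows immediately from the matrix rank theorem. It is therefore necessary to prove only the equivalence of statements 1° to 6° and 8°. Since statement 6° is by definition the conjunction of statements 2° and 4°, it is sufficient to prove the following diagram of implications:

$5^\circ \Rightarrow 1^\circ \Rightarrow 2^\circ \Rightarrow 8^\circ \Rightarrow 5^\circ, \qquad 5^\circ \Rightarrow 3^\circ \Rightarrow 4^\circ \Rightarrow 8^\circ.$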



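Implication 5° ⇒ 1°. If $\mathbf{A}^{-1}$ is an inverse operator, then $\mathbf{A}^{-1}\mathbf{A} = \mathbf{E}$.

Implication 1° ⇒ 2°. If $\mathbf{B}\mathbf{A} = \mathbf{E}$ and $\mathbf{A}x = 0$, then $x = \mathbf{E}x = \mathbf{B}\mathbf{A}x = \mathbf{B}0 = 0$.

Implication 2° ⇒ 8°. If the vectors $\mathbf{A}e_1, \dots, \mathbf{A}e_n$ are linearly dependent, i.e. $k^1\mathbf{A}e_1 + \dots + k^n\mathbf{A}e_n = 0$, where $(k^1, \dots, k^n) \ne (0, \dots, 0)$, then for the vector $e = k^1 e_1 + \dots + k^n e_n \ne 0$ we have $\mathbf{A}e = 0$. Consequently, if $\operatorname{Ker} \mathbf{A} = 0$, then the vectors $\mathbf{A}e_1, \dots, \mathbf{A}e_n$ are linearly independent and hence constitute a basis.

Implication 5° ⇒ 3°. If $\mathbf{A}^{-1}$ is an inverse operator, then $\mathbf{A}\mathbf{A}^{-1} = \mathbf{E}$.

Implication 3° ⇒ 4°. If $\mathbf{A}\mathbf{C} = \mathbf{E}$, then $\mathbf{A}y = x$ for any vector $x \in \mathcal{V}$, where $y = \mathbf{C}x$.

Implication 4° ⇒ 8°. If for any vector $x \in \mathcal{V}$ there exists a vector $y \in \mathcal{V}$ such that $\mathbf{A}y = x$, then $x = y^1\mathbf{A}e_1 + \dots + y^n\mathbf{A}e_n$. This proves that the family $\mathbf{A}e_1, \dots, \mathbf{A}e_n$ consisting of $n$ vectors is complete. Hence it is a basis.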




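Implication 8° ⇒ 5°. In the basis $e_1' = \mathbf{A}e_1, \dots, e_n' = \mathbf{A}e_n$ the family of vectors $b_1 = e_1, \dots, b_n = e_n$ determines an operator $\mathbf{B}$ for which $\mathbf{B}e_1' = b_1, \dots, \mathbf{B}e_n' = b_n$ and hence $(\mathbf{B}\mathbf{A})e_1 = e_1, \dots, (\mathbf{B}\mathbf{A})e_n = e_n$, i.e. $\mathbf{B}\mathbf{A} = \mathbf{E}$. For the same operator $(\mathbf{A}\mathbf{B})e_1' = e_1', \dots, (\mathbf{A}\mathbf{B})e_n' = e_n'$ and hence $\mathbf{A}\mathbf{B} = \mathbf{E}$. Consequently the operator $\mathbf{A}$ is invertible (and $\mathbf{B} = \mathbf{A}^{-1}$). □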

The vector equation 

Ax = b 


can be written in coordinates as a system of n linear equa¬ 
tions in n unknowns. In terms of equations therefore the 
equivalence of statements 2° and 4° means that a system of n 
nonhomogeneous linear equations in n unknowns is compatible 
for any free terms if and only if the corresponding system of 
homogeneous linear equations has only a trivial solution. 

A direct extension of this beautiful statement to the case 
where the number of equations is not equal to that of un¬ 
knowns is shown by elementary examples to be false. To 
obtain such an extension it is first necessary to appropriately 
reformulate the statement. 

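Let $\mathbf{A} \in \operatorname{Op}(\mathcal{V})$. We associate with an arbitrary covector $\xi \in \mathcal{V}'$ a functional $\mathbf{A}'\xi$ in $\mathcal{V}$ by setting

(6) $(\mathbf{A}'\xi)(x) = \xi(\mathbf{A}x), \quad x \in \mathcal{V}.$

An automatic check shows that

(a) the functional $\mathbf{A}'\xi$ is linear, i.e. is a covector of $\mathcal{V}'$;

(b) the resulting mapping $\mathbf{A}'\colon \mathcal{V}' \to \mathcal{V}'$ is linear, i.e. $\mathbf{A}'$ is a linear operator.

Definition 3. The operator $\mathbf{A}'$ is called the operator adjoint to the operator $\mathbf{A}$.

If we introduce the natural pairing $\langle x, \xi \rangle = \xi(x)$ between the spaces $\mathcal{V}$ and $\mathcal{V}'$ (see Lecture 4), then formula (6) defining the adjoint operator $\mathbf{A}'$ takes the form

$\langle x, \mathbf{A}'\xi \rangle = \langle \mathbf{A}x, \xi \rangle.$

From the symmetry of the formula it immediately ensues that the mapping $\mathbf{A} \mapsto \mathbf{A}'$ of the space $\operatorname{Op}(\mathcal{V})$ into the space $\operatorname{Op}(\mathcal{V}')$ is involutory, i.e.

$\mathbf{A}'' = \mathbf{A}.$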




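In particular it follows that the mapping $\mathbf{A} \mapsto \mathbf{A}'$ is bijective.

Moreover, it is clear that

$(\mathbf{A} + \mathbf{B})' = \mathbf{A}' + \mathbf{B}'$ and $(k\mathbf{A})' = k\mathbf{A}'.$

This means that the mapping $\mathbf{A} \mapsto \mathbf{A}'$ is an isomorphism of the vector space $\operatorname{Op}(\mathcal{V})$ onto the vector space $\operatorname{Op}(\mathcal{V}')$. □

There is thus no natural isomorphism between the vector spaces $\mathcal{V}$ and $\mathcal{V}'$, but there is one between the vector spaces $\operatorname{Op}(\mathcal{V})$ and $\operatorname{Op}(\mathcal{V}')$!

With respect to multiplication, the mapping $\mathbf{A} \mapsto \mathbf{A}'$ is not an isomorphism, since the order of the factors is reversed:

$(\mathbf{A}\mathbf{B})' = \mathbf{B}'\mathbf{A}'.$

Indeed, $\langle x, (\mathbf{A}\mathbf{B})'\xi \rangle = \langle \mathbf{A}\mathbf{B}x, \xi \rangle = \langle \mathbf{B}x, \mathbf{A}'\xi \rangle = \langle x, \mathbf{B}'\mathbf{A}'\xi \rangle$. A linear isomorphism having this property is usually called an anti-isomorphism.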

The formula $a_i^j = e^j(\mathbf{A}e_i)$ for the elements of the matrix of the operator $\mathbf{A}$ in the basis $e_1, \dots, e_n$ implies that

$a_i^j = \langle \mathbf{A}e_i, e^j \rangle.$

For the elements $a_i'^{\,j}$ of the matrix of the adjoint operator $\mathbf{A}'$ in the conjugate basis $e^1, \dots, e^n$ we therefore have

$a_i'^{\,j} = \langle e_i, \mathbf{A}'e^j \rangle = \langle \mathbf{A}e_i, e^j \rangle$

and therefore $a_i'^{\,j} = a_i^j$, i.e. $\mathbf{A}'e^j = a_i^j e^i$. This does not mean, however, that the matrices of the operators $\mathbf{A}$ and $\mathbf{A}'$ coincide. Indeed, by definition, the columns of the matrix of an operator are the coordinates of the vectors resulting from the application of the operator to the vectors of the basis. For the operator $\mathbf{A}$ this means (by virtue of the formula $\mathbf{A}e_i = a_i^j e_j$) that the $i$th column of its matrix consists of the numbers $a_i^1, \dots, a_i^n$. As to the operator $\mathbf{A}'$, however, the formula $\mathbf{A}'e^j = a_i^j e^i$ implies that the $j$th column of its matrix consists of the numbers $a_1^j, \dots, a_n^j$, i.e. of the same numbers that the $j$th row of the matrix of the operator $\mathbf{A}$ contains. Thus the matrix of the adjoint operator $\mathbf{A}'$ in the conjugate basis $e^1, \dots, e^n$ is the matrix resulting from transposing the matrix $A$ of the operator $\mathbf{A}$ in the basis $e_1, \dots, e_n$.
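The relation between the adjoint operator and the transposed matrix can be tried out in coordinates. The following Python sketch is an illustration added here, under the assumption that vectors and covectors are given by their coordinate lists in a basis and in the conjugate basis; the matrices `A`, `B` and the names `apply`, `pairing`, `transpose`, `matmul` are hypothetical.

```python
def transpose(M):
    """Transposed matrix (rows become columns)."""
    return [list(col) for col in zip(*M)]

def matmul(X, Y):
    """Product of two matrices stored as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))]
            for i in range(len(X))]

def apply(M, v):
    """Coordinates of the image of v under the operator with matrix M."""
    return [sum(M[i][j] * v[j] for j in range(len(v))) for i in range(len(M))]

def pairing(x, xi):
    """Natural pairing <x, xi> = xi(x) in dual coordinates."""
    return sum(a * b for a, b in zip(x, xi))

A = [[1, 2], [3, 4]]
B = [[0, 1], [2, 1]]
x = [5, -1]    # coordinates of a vector
xi = [2, 7]    # coordinates of a covector in the conjugate basis

lhs = pairing(apply(A, x), xi)              # <Ax, xi>
rhs = pairing(x, apply(transpose(A), xi))   # <x, A'xi>

adjoint_of_product = transpose(matmul(A, B))              # matrix of (AB)'
reversed_product = matmul(transpose(B), transpose(A))     # matrix of B'A'
same_order_product = matmul(transpose(A), transpose(B))   # matrix of A'B'
```

Here `lhs` and `rhs` agree, and `adjoint_of_product` coincides with `reversed_product` but not with `same_order_product`, in accordance with the fact that the mapping is an anti-isomorphism.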




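Proposition 3. We have

$\operatorname{Ker} \mathbf{A}' = (\operatorname{Im} \mathbf{A})^\circ, \quad \operatorname{Im} \mathbf{A}' = (\operatorname{Ker} \mathbf{A})^\circ,$

$\operatorname{Ker} \mathbf{A} = (\operatorname{Im} \mathbf{A}')^\circ, \quad \operatorname{Im} \mathbf{A} = (\operatorname{Ker} \mathbf{A}')^\circ.$

Proof. The inclusion $\xi \in \operatorname{Ker} \mathbf{A}'$ is equivalent to the fact that for any vector $x \in \mathcal{V}$ we have $(\mathbf{A}'\xi)(x) = 0$, i.e. $\xi(\mathbf{A}x) = 0$, an equation characterizing the covectors of $(\operatorname{Im} \mathbf{A})^\circ$. Hence $\operatorname{Ker} \mathbf{A}' = (\operatorname{Im} \mathbf{A})^\circ$. Replacing here $\mathbf{A}$ by $\mathbf{A}'$ we get $\operatorname{Ker} \mathbf{A} = (\operatorname{Im} \mathbf{A}')^\circ$, and passing to annihilators (and using Proposition 5 of Lecture 4) we get $(\operatorname{Ker} \mathbf{A}')^\circ = \operatorname{Im} \mathbf{A}$ and $(\operatorname{Ker} \mathbf{A})^\circ = \operatorname{Im} \mathbf{A}'$. □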

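In particular we see that $\operatorname{Im} \mathbf{A} = \mathcal{V}$ if and only if $\operatorname{Ker} \mathbf{A}' = 0$. In terms of coordinates this means (for the case $m = n$) that the system of nonhomogeneous linear equations

(7) $\begin{aligned} a_1^1 x^1 + \dots + a_n^1 x^n &= b^1, \\ \dots\dots\dots\dots\dots\dots & \\ a_1^m x^1 + \dots + a_n^m x^n &= b^m \end{aligned}$

is compatible for any free terms $b^1, \dots, b^m$ if and only if the system of homogeneous equations

(8) $\begin{aligned} a_1^1 \xi_1 + \dots + a_1^m \xi_m &= 0, \\ \dots\dots\dots\dots\dots\dots & \\ a_n^1 \xi_1 + \dots + a_n^m \xi_m &= 0 \end{aligned}$

with the transposed matrix has only the trivial solution. □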

But it is easy to see that this statement (usually called the Fredholm alternative) is true for any $m$ and $n$ too. Indeed, system (8) has only the trivial solution if and only if the rank $r$ of the matrix of its coefficients is equal to $m$. On the other hand, by the Kronecker-Capelli theorem system (7) is compatible if and only if the rank $r$ of the matrix of its coefficients is not affected by the addition of the column $b$ of free terms, which obviously holds for any column $b$ if and only if $r = m$. □
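The rank criterion used in this argument is easy to try out. The Python sketch below is an added illustration, not part of the lecture: it computes ranks by Gaussian elimination over the rationals and compares the two conditions of the Fredholm alternative for two hypothetical matrices, `wide` and `tall`. All function names are ours.

```python
from fractions import Fraction

def rank(rows):
    """Rank of a matrix (list of rows) by Gaussian elimination over Q."""
    M = [[Fraction(v) for v in row] for row in rows]
    r = 0
    for col in range(len(M[0]) if M else 0):
        pivot = next((i for i in range(r, len(M)) if M[i][col] != 0), None)
        if pivot is None:
            continue
        M[r], M[pivot] = M[pivot], M[r]
        for i in range(r + 1, len(M)):
            factor = M[i][col] / M[r][col]
            M[i] = [a - factor * b for a, b in zip(M[i], M[r])]
        r += 1
    return r

def transpose(rows):
    return [list(col) for col in zip(*rows)]

def solvable_for_every_rhs(A):
    """System (7) with m equations is compatible for all b iff rank A = m."""
    return rank(A) == len(A)

def transposed_has_only_trivial_solution(A):
    """System (8) in m unknowns has only the zero solution iff its rank is m."""
    return rank(transpose(A)) == len(A)

def homogeneous_has_only_trivial_solution(A):
    """The system Ax = 0 in n unknowns has only the zero solution iff rank A = n."""
    return rank(A) == len(A[0])

wide = [[1, 0, 2], [0, 1, 3]]     # m = 2 equations, n = 3 unknowns
tall = [[1, 0], [0, 1], [1, 1]]   # m = 3 equations, n = 2 unknowns
```

For `tall` the system $Ax = 0$ has only the trivial solution, yet $Ax = b$ is not compatible for every $b$; this is one of the elementary examples showing that the naive extension of the statement to $m \ne n$ fails, while the formulation with the transposed system survives.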

For the Fredholm alternative to be formulated in "operator" terms also for $m \ne n$ one should extend the concept of adjoint operator to the case of an arbitrary homomorphism $\varphi\colon \mathcal{V} \to \mathcal{W}$, where $\mathcal{W} \ne \mathcal{V}$. This can be done without any difficulty at the cost of a slight complication of the notation and statements. The analogue of Proposition 3 remains valid, and it gives precisely the Fredholm alternative in the general form.

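Definition 4. A subspace $\mathcal{P}$ of a space $\mathcal{V}$ is said to be invariant under the operator $\mathbf{A}\colon \mathcal{V} \to \mathcal{V}$ if

$\mathbf{A}x \in \mathcal{P}$ for any vector $x \in \mathcal{P}$.

Defined in this case is the operator

$\mathbf{A}|_{\mathcal{P}} \in \operatorname{Op}(\mathcal{P})$

acting according to the formula

$(\mathbf{A}|_{\mathcal{P}})x = \mathbf{A}x,$

where the vector $\mathbf{A}x$ on the right is regarded as an element of the subspace $\mathcal{P}$.

The operator $\mathbf{A}|_{\mathcal{P}}$ is called the restriction of the operator $\mathbf{A}$ to the invariant subspace $\mathcal{P}$. It is also said to be induced by the operator $\mathbf{A}$.

Since $\dim \mathcal{P} < \dim \mathcal{V}$, the operator $\mathbf{A}|_{\mathcal{P}}$ lends itself to study more easily than the operator $\mathbf{A}$. At the same time, by studying it we can often obtain a good deal of information about the operator $\mathbf{A}$ itself.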

Especially satisfactory is the situation in the case (unfortunately, not always holding) where there exists a second invariant subspace $\mathcal{Q}$ complementary to the subspace $\mathcal{P}$, i.e. where the space $\mathcal{V}$ is the direct sum $\mathcal{V} = \mathcal{P} \oplus \mathcal{Q}$ of the invariant subspaces $\mathcal{P}$ and $\mathcal{Q}$. In this case the operator $\mathbf{A}$ is completely determined by the operators $\mathbf{A}|_{\mathcal{P}}$ and $\mathbf{A}|_{\mathcal{Q}}$. Indeed, for any vector $z = x + y$ of the space $\mathcal{V}$, where $x \in \mathcal{P}$, $y \in \mathcal{Q}$, we obviously have

$\mathbf{A}z = (\mathbf{A}|_{\mathcal{P}})x + (\mathbf{A}|_{\mathcal{Q}})y.$

The complete reducibility of the operator $\mathbf{A}$ to the operators $\mathbf{A}|_{\mathcal{P}}$ and $\mathbf{A}|_{\mathcal{Q}}$ is clearly demonstrated by the matrix $A = (a_i^j)$ of the operator $\mathbf{A}$ in a basis $e_1, \dots, e_n$ of the space $\mathcal{V}$ such that $\mathcal{P} = [e_1, \dots, e_p]$ and $\mathcal{Q} = [e_{p+1}, \dots, e_n]$. Indeed, since $\mathbf{A}e_i = a_i^j e_j \in \mathcal{P}$ for $1 \le i \le p$, we have $a_i^j = 0$ if $1 \le i \le p$ and $p + 1 \le j \le n$. Similarly, since $\mathbf{A}e_i \in \mathcal{Q}$ for $p + 1 \le i \le n$, we have $a_i^j = 0$ if $p + 1 \le i \le n$ and $1 \le j \le p$.

This means that the matrix $A$ has a diagonal block form in the basis $e_1, \dots, e_n$:

(9) $A = \begin{pmatrix} A_1 & 0 \\ 0 & A_2 \end{pmatrix},$

where $A_1$ is the matrix of the operator $\mathbf{A}|_{\mathcal{P}}$ in the basis $e_1, \dots, e_p$ and $A_2$ is the matrix of the operator $\mathbf{A}|_{\mathcal{Q}}$ in the basis $e_{p+1}, \dots, e_n$.

A matrix $A$ of the form (9) is sometimes said to be decomposed as a direct sum of the matrices $A_1$ and $A_2$ (written $A = A_1 \oplus A_2$). Thus every decomposition of the space $\mathcal{V}$ as a direct sum of invariant subspaces determines a decomposition of the matrix of the operator as a direct sum of the matrices of the induced operators.

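In the case where the invariant subspace $\mathcal{P}$ has no invariant complement $\mathcal{Q}$ (or the latter is not known) we can represent the matrix $A$ (by choosing a basis $e_1, \dots, e_n$ so that $\mathcal{P} = [e_1, \dots, e_p]$) in triangular block form

(10) $A = \begin{pmatrix} A_1 & * \\ 0 & B \end{pmatrix},$

where $A_1$ is the matrix of the operator $\mathbf{A}|_{\mathcal{P}}$.

From the fact that the subspace $\mathcal{P}$ is invariant under the operator $\mathbf{A}$ we immediately see that the formula

$\mathbf{B}(x + \mathcal{P}) = \mathbf{A}x + \mathcal{P}$

correctly defines in the factor space $\mathcal{V}/\mathcal{P}$ some (obviously linear) operator

$\mathbf{B}\colon \mathcal{V}/\mathcal{P} \to \mathcal{V}/\mathcal{P}.$

The operator $\mathbf{B}$ is also said to be induced by the operator $\mathbf{A}$.

If the basis $e_1, \dots, e_n$ of the space $\mathcal{V}$ is chosen so that $\mathcal{P} = [e_1, \dots, e_p]$, then the cosets $e_{p+1} + \mathcal{P}, \dots, e_n + \mathcal{P}$ will obviously constitute a basis of the factor space $\mathcal{V}/\mathcal{P}$, and the matrix of the operator $\mathbf{B}$ in that basis will be the matrix $B$ of (10).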







Lecture 15 


Eigenvalues • Characteristic roots • Diagonalizable operators • Operators with simple spectrum • The existence of a basis in which the matrix of an operator is triangular • Nilpotent operators


The simplest invariant subspaces are one-dimensional sub¬ 
spaces. 

Definition 1. A vector $x \ne 0$ is said to be an eigenvector of an operator $\mathbf{A}$ if it generates a one-dimensional invariant subspace.

It is clear that this is the case if and only if there exists a number $\lambda$ such that

(1) $\mathbf{A}x = \lambda x.$

Every number $\lambda$ for which there exists a vector $x \ne 0$ that satisfies relation (1) (and hence is an eigenvector of the operator $\mathbf{A}$) is called an eigenvalue of the operator $\mathbf{A}$. An eigenvector $x$ for which (1) holds with a given $\lambda$ is said to belong to the eigenvalue $\lambda$.

It is convenient to assume that the zero vector 0 (which by definition is not an eigenvector) also belongs to every eigenvalue $\lambda$. Then for any eigenvalue $\lambda$ the set of all vectors $x \in \mathcal{V}$ belonging to it is obviously a subspace. It is called the proper subspace belonging to the eigenvalue $\lambda$ and is denoted by $\mathcal{P}_\lambda$. Its dimension $p_\lambda = \dim \mathcal{P}_\lambda$ is called the geometric multiplicity of the eigenvalue $\lambda$. By definition $1 \le p_\lambda \le n$.

For any eigenvector $x \ne 0$ belonging to an eigenvalue $\lambda$ the one-dimensional invariant subspace $[x]$ it generates lies entirely in $\mathcal{P}_\lambda$. Conversely, each one-dimensional subspace of the space $\mathcal{P}_\lambda$ is invariant and hence, in particular, the space $\mathcal{P}_\lambda$ is decomposable as a direct sum of one-dimensional invariant subspaces. To obtain such a decomposition it is sufficient to choose an arbitrary basis in $\mathcal{P}_\lambda$.

Geometrically the subspace $\mathcal{P}_\lambda$ can be characterized as the maximum invariant subspace on which the operator $\mathbf{A}$ (more precisely, its restriction $\mathbf{A}|_{\mathcal{P}_\lambda}$) is the scalar operator $\lambda\mathbf{E}$. One can also say that $\mathcal{P}_\lambda$ is the kernel of the operator $\mathbf{A} - \lambda\mathbf{E}$:

$\mathcal{P}_\lambda = \operatorname{Ker}(\mathbf{A} - \lambda\mathbf{E}).$

Indeed, the equation $(\mathbf{A} - \lambda\mathbf{E})x = 0$ is exactly equivalent to equation (1). □

We thus see that a number $\lambda$ is an eigenvalue of an operator $\mathbf{A}$ if and only if the operator $\mathbf{A} - \lambda\mathbf{E}$ has a nonzero kernel, i.e. is noninvertible (singular); see Proposition 2 of the preceding lecture. In other words, $\lambda$ is an eigenvalue if and only if

$\det(A - \lambda E) = 0,$

where $A$ is the matrix of the operator $\mathbf{A}$ in an arbitrary basis $e_1, \dots, e_n$.

The determinant

$\det(A - \lambda E) = \begin{vmatrix} a_1^1 - \lambda & \cdots & a_n^1 \\ \vdots & \ddots & \vdots \\ a_1^n & \cdots & a_n^n - \lambda \end{vmatrix}$

is, as is easily seen, a polynomial of degree $n$ in $\lambda$. This polynomial is independent of the choice of the basis $e_1, \dots, e_n$. Indeed, in any other basis the matrix of the operator $\mathbf{A}$ has the form $C^{-1}AC$ (see formula (5) of the preceding lecture) and

$C^{-1}AC - \lambda E = C^{-1}(A - \lambda E)C,$

and therefore

$\det(C^{-1}AC - \lambda E) = (\det C)^{-1} \det(A - \lambda E)\,(\det C) = \det(A - \lambda E).$ □


Definition 2. The polynomial

$f_{\mathbf{A}}(\lambda) = \det(A - \lambda E)$

is called the characteristic polynomial of the operator $\mathbf{A}$, and its roots (in the corresponding extension of the field $K$) are called the characteristic roots of the operator $\mathbf{A}$.

According to what has been said above, any eigenvalue of the operator $\mathbf{A}$ is a characteristic root of it and, conversely, any characteristic root lying in the field $K$ is an eigenvalue. □

A practical method for finding proper subspaces is based on this statement (and on the fact that $\mathcal{P}_\lambda = \operatorname{Ker}(\mathbf{A} - \lambda\mathbf{E})$). First, solving the equation $f_{\mathbf{A}}(\lambda) = 0$, we find all its roots $\lambda_i$ lying in $K$, and then for every such root $\lambda_i$ we find the subspace $\mathcal{P}_{\lambda_i}$ by solving the system of homogeneous linear equations with the matrix $A - \lambda_i E$.
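For a $2 \times 2$ matrix the two steps of this method can be sketched in a few lines. The Python code below is an added illustration under simplifying assumptions: the matrix has integer entries, and only integer roots are sought (for a monic polynomial with integer coefficients these are all of its rational roots). The matrix `A` and the function names are hypothetical.

```python
from math import isqrt

def char_poly_2x2(A):
    """Coefficients (1, c1, c0) of det(A - tE) = t^2 + c1*t + c0 for a 2x2 matrix."""
    (a, b), (c, d) = A
    return 1, -(a + d), a * d - b * c

def integer_roots(c1, c0):
    """Integer roots of t^2 + c1*t + c0, in increasing order."""
    disc = c1 * c1 - 4 * c0
    if disc < 0 or isqrt(disc) ** 2 != disc:
        return []
    s = isqrt(disc)
    return sorted({r // 2 for r in (-c1 - s, -c1 + s) if r % 2 == 0})

def proper_subspace_basis(A, lam):
    """Basis of Ker(A - lam*E) for a 2x2 matrix A and an eigenvalue lam."""
    (a, b), (c, d) = A
    rows = [(a - lam, b), (c, d - lam)]
    nonzero = [row for row in rows if row != (0, 0)]
    if not nonzero:
        return [(1, 0), (0, 1)]    # A - lam*E = 0: the whole plane
    p, q = nonzero[0]
    return [(q, -p)]               # orthogonal to the row (p, q)

A = [[4, 1],
     [2, 3]]
_, c1, c0 = char_poly_2x2(A)       # t^2 - 7t + 10
eigenvalues = integer_roots(c1, c0)
bases = {lam: proper_subspace_basis(A, lam) for lam in eigenvalues}
```

In the rank-one case the function relies on the fact that $A - \lambda E$ is singular when $\lambda$ is an eigenvalue, so its two rows are proportional and a single nonzero row determines the kernel.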

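The multiplicity of the eigenvalue $\lambda_0$ as a root of the characteristic polynomial, i.e. the number $n_{\lambda_0}$ such that the polynomial $f_{\mathbf{A}}(\lambda)$ is divisible by $(\lambda - \lambda_0)^{n_{\lambda_0}}$ but is not divisible by $(\lambda - \lambda_0)^{n_{\lambda_0} + 1}$, is called the algebraic multiplicity of the eigenvalue $\lambda_0$. It is easy to see that the algebraic multiplicity of an eigenvalue is at least as high as its geometric multiplicity:

$p_{\lambda_0} \le n_{\lambda_0}.$

Indeed, let $p = p_{\lambda_0}$ and let $e_1, \dots, e_n$ be a basis of the space $\mathcal{V}$ such that $\mathcal{P}_{\lambda_0} = [e_1, \dots, e_p]$. In that basis the matrix of the operator $\mathbf{A}$ has the form

(2) $A = \begin{pmatrix} A_1 & * \\ 0 & B \end{pmatrix}$

and hence

$f_{\mathbf{A}}(\lambda) = \det(A - \lambda E) = \det(A_1 - \lambda E) \cdot \det(B - \lambda E).$

But $A_1$ is the matrix of the operator $\mathbf{A}|_{\mathcal{P}_{\lambda_0}} = \lambda_0\mathbf{E}$ and hence $\det(A_1 - \lambda E) = (\lambda_0 - \lambda)^p$. This proves that the polynomial $f_{\mathbf{A}}(\lambda)$ is divisible by $(\lambda - \lambda_0)^p$ and hence $p \le n_{\lambda_0}$. □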

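Remark. The operator $\mathbf{A}$ has a matrix of the form (2) in any basis $e_1, \dots, e_n$ for which the subspace $\mathcal{P} = [e_1, \dots, e_p]$ is invariant, with $A_1$ the matrix of the operator $\mathbf{A}|_{\mathcal{P}}$ and $B$ the matrix of the induced operator $\mathbf{B}\colon \mathcal{V}/\mathcal{P} \to \mathcal{V}/\mathcal{P}$. This proves that for any invariant subspace $\mathcal{P}$ there is a decomposition

$f_{\mathbf{A}}(\lambda) = f_{\mathbf{A}|_{\mathcal{P}}}(\lambda)\, f_{\mathbf{B}}(\lambda).$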

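Let $\lambda_1, \dots, \lambda_m$ be distinct eigenvalues of the operator $\mathbf{A}$ and let

$\mathcal{P}_1 = \mathcal{P}_{\lambda_1}, \dots, \mathcal{P}_m = \mathcal{P}_{\lambda_m}$

be the proper subspaces belonging to them.

Proposition 1. The sum

$\mathcal{P}_1 + \dots + \mathcal{P}_m$

of the subspaces $\mathcal{P}_1, \dots, \mathcal{P}_m$ is a direct sum, i.e. the equation

(3) $x_1 + \dots + x_m = 0,$

where $x_1 \in \mathcal{P}_1, \dots, x_m \in \mathcal{P}_m$, holds if and only if

$x_1 = 0, \dots, x_m = 0.$

Proof. We proceed by induction on $m$. For $m = 1$ the statement is obvious (and has no content). Suppose we have already proved that the sum of the $m - 1$ spaces $\mathcal{P}_1, \dots, \mathcal{P}_{m-1}$ is direct. By applying to (3) the operator $\mathbf{A}$ we obtain the relation

(4) $\lambda_1 x_1 + \dots + \lambda_m x_m = 0.$

On multiplying (3) by $\lambda_m$ and subtracting from (4) we then get

$(\lambda_1 - \lambda_m)x_1 + \dots + (\lambda_{m-1} - \lambda_m)x_{m-1} = 0.$

By the induction hypothesis it follows that

$(\lambda_1 - \lambda_m)x_1 = 0, \dots, (\lambda_{m-1} - \lambda_m)x_{m-1} = 0$

and hence (since by hypothesis $\lambda_1 - \lambda_m \ne 0, \dots, \lambda_{m-1} - \lambda_m \ne 0$) that

$x_1 = 0, \dots, x_{m-1} = 0.$

But then, according to (3), also $x_m = 0$. □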







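Let there exist (distinct) eigenvalues

(5) $\lambda_1, \dots, \lambda_m$

such that

(6) $\mathcal{P}_{\lambda_1} \oplus \dots \oplus \mathcal{P}_{\lambda_m} = \mathcal{V}$

and hence

(7) $p_{\lambda_1} + \dots + p_{\lambda_m} = n.$

It is easy to see that the numbers (5) exhaust all the eigenvalues of the operator $\mathbf{A}$. Indeed, for any other eigenvalue $\lambda_0$ the subspace $\mathcal{P}_{\lambda_0}$ would form with $\mathcal{V}$, according to Proposition 1, a direct sum, which is impossible. □

On choosing a basis in each of the spaces $\mathcal{P}_{\lambda_1}, \dots, \mathcal{P}_{\lambda_m}$ we obtain a basis of the space $\mathcal{V}$ consisting of eigenvectors. The matrix of the operator $\mathbf{A}$ in that basis is diagonal:

$\begin{pmatrix} \lambda_1 & & 0 \\ & \ddots & \\ 0 & & \lambda_m \end{pmatrix},$

and its diagonal elements are the eigenvalues (5), each $\lambda_i$ repeated $p_{\lambda_i}$ times.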

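Conversely, let there exist in the space $\mathcal{V}$ a basis in which the matrix $A$ of the operator $\mathbf{A}$ is diagonal. Then the vectors of the basis are eigenvectors and the diagonal elements of the matrix $A$ are eigenvalues of the operator $\mathbf{A}$. Let $\lambda_1, \dots, \lambda_m$ be all the distinct diagonal elements of the matrix $A$ and let the element $\lambda_i$, $i = 1, \dots, m$, be repeated $q_i$ times. Also let $\mathcal{Q}_i$, $i = 1, \dots, m$, be the subspace of the space $\mathcal{V}$ generated by the vectors of the basis belonging to the eigenvalue $\lambda_i$. Then $\dim \mathcal{Q}_i = q_i$,

$\mathcal{Q}_1 \oplus \dots \oplus \mathcal{Q}_m = \mathcal{V}$

and $\mathcal{Q}_i \subset \mathcal{P}_{\lambda_i}$. Therefore, in particular,

(9) $q_1 + \dots + q_m = n$ and $q_1 \le p_{\lambda_1}, \dots, q_m \le p_{\lambda_m}.$

But according to Proposition 1 the sum $\mathcal{P}_{\lambda_1} + \dots + \mathcal{P}_{\lambda_m}$ of the subspaces $\mathcal{P}_{\lambda_1}, \dots, \mathcal{P}_{\lambda_m}$ is their direct sum and hence has dimension $p_{\lambda_1} + \dots + p_{\lambda_m}$. Therefore $p_{\lambda_1} + \dots + p_{\lambda_m} \le n$, whence by virtue of relations (9) it follows that $q_1 = p_{\lambda_1}, \dots, q_m = p_{\lambda_m}$, i.e. that $\mathcal{Q}_1 = \mathcal{P}_{\lambda_1}, \dots, \mathcal{Q}_m = \mathcal{P}_{\lambda_m}$. Consequently, for the subspaces $\mathcal{P}_{\lambda_1}, \dots, \mathcal{P}_{\lambda_m}$ decomposition (6) holds.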

Since the existence of a basis in which the matrix $A$ of the operator $\mathbf{A}$ is diagonal is equivalent to the decomposability of the space $\mathcal{V}$ as a direct sum of one-dimensional invariant subspaces, this proves the following proposition.

Proposition 2. For any linear operator $\mathbf{A}\colon \mathcal{V} \to \mathcal{V}$ the following statements are equivalent:

1° There exist eigenvalues $\lambda_1, \dots, \lambda_m$ such that

$\mathcal{P}_{\lambda_1} \oplus \dots \oplus \mathcal{P}_{\lambda_m} = \mathcal{V}.$

2° The space $\mathcal{V}$ is a direct sum of one-dimensional subspaces invariant under the operator $\mathbf{A}$.

3° In the space $\mathcal{V}$ there exists a basis consisting of eigenvectors, i.e. a basis in which the matrix of the operator $\mathbf{A}$ is diagonal. □

The eigenvalues $\lambda_1, \dots, \lambda_m$ appearing in 1° (and implicitly in 2°) exhaust all the eigenvalues of the operator $\mathbf{A}$. Every basis in which the matrix of the operator $\mathbf{A}$ is diagonal is obtained by combining bases of the spaces $\mathcal{P}_{\lambda_1}, \dots, \mathcal{P}_{\lambda_m}$, so that for any eigenvalue $\lambda_i$ there are exactly $p_{\lambda_i}$ vectors in that basis belonging to $\lambda_i$.

Definition 3. An operator $\mathbf{A}$ is said to be diagonalizable if Statements 1° to 3° hold for it.

Computing the characteristic polynomial of a diagonalizable operator $\mathbf{A}$ in a basis consisting of eigenvectors we get immediately

$f_{\mathbf{A}}(\lambda) = (\lambda_1 - \lambda)^{p_1} \cdots (\lambda_m - \lambda)^{p_m},$

where $\lambda_1, \dots, \lambda_m$ are the eigenvalues of the operator $\mathbf{A}$ and $p_1 = p_{\lambda_1}, \dots, p_m = p_{\lambda_m}$ are their geometric multiplicities. This proves that for a diagonalizable operator any of its characteristic roots $\lambda_0$ is in the field $K$ (and hence is an eigenvalue) and that its algebraic multiplicity $n_{\lambda_0}$ coincides with its geometric multiplicity $p_{\lambda_0}$. □

It turns out that this necessary condition for diagonalizability is a sufficient condition as well, so that the following theorem holds.

Theorem 1. A linear operator $\mathbf{A}$ is diagonalizable if and only if every characteristic root $\lambda_0$ of it is in the field $K$ and $n_{\lambda_0} = p_{\lambda_0}$.

Proof. It is necessary for us to prove only the sufficiency of this condition.

Let $\lambda_1, \dots, \lambda_m$ be all the characteristic roots of the operator $\mathbf{A}$. By hypothesis they are in $K$ and hence are also eigenvalues. Therefore the subspaces $\mathcal{P}_{\lambda_1}, \dots, \mathcal{P}_{\lambda_m}$ are defined, the dimension of whose sum (a direct one, as we know) is

$p_{\lambda_1} + \dots + p_{\lambda_m} = n_{\lambda_1} + \dots + n_{\lambda_m} = n$

(the sum of the multiplicities of all the roots of a polynomial is equal to its degree). Hence $\mathcal{P}_{\lambda_1} \oplus \dots \oplus \mathcal{P}_{\lambda_m} = \mathcal{V}$. □
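The criterion of Theorem 1 can be illustrated numerically. The Python sketch below is an addition, restricted for simplicity to triangular matrices, for which the characteristic polynomial is the product of the factors $a_i^i - \lambda$ and the algebraic multiplicity can be read off the diagonal. The geometric multiplicity is computed as $n$ minus the rank of $A - \lambda E$. The matrices `jordan` and `simple` and all function names are hypothetical.

```python
from fractions import Fraction

def rank(rows):
    """Rank of a matrix (list of rows) by Gaussian elimination over Q."""
    M = [[Fraction(v) for v in row] for row in rows]
    r = 0
    for col in range(len(M[0]) if M else 0):
        pivot = next((i for i in range(r, len(M)) if M[i][col] != 0), None)
        if pivot is None:
            continue
        M[r], M[pivot] = M[pivot], M[r]
        for i in range(r + 1, len(M)):
            factor = M[i][col] / M[r][col]
            M[i] = [a - factor * b for a, b in zip(M[i], M[r])]
        r += 1
    return r

def geometric_multiplicity(A, lam):
    """dim Ker(A - lam*E) = n - rank(A - lam*E)."""
    n = len(A)
    shifted = [[A[i][j] - (lam if i == j else 0) for j in range(n)]
               for i in range(n)]
    return n - rank(shifted)

def algebraic_multiplicity_triangular(A, lam):
    """Multiplicity of lam as a root of det(A - tE) for a triangular matrix A."""
    return sum(1 for i in range(len(A)) if A[i][i] == lam)

def is_diagonalizable_triangular(A):
    """Theorem 1 for a triangular matrix: compare the two multiplicities."""
    eigenvalues = {A[i][i] for i in range(len(A))}
    return all(geometric_multiplicity(A, lam)
               == algebraic_multiplicity_triangular(A, lam)
               for lam in eigenvalues)

jordan = [[1, 1], [0, 1]]   # eigenvalue 1: algebraic multiplicity 2, geometric 1
simple = [[1, 5], [0, 2]]   # two simple eigenvalues
```

The matrix `jordan` fails the test because its only eigenvalue has a one-dimensional proper subspace, while `simple` passes it.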

Definition 4. The set of all characteristic roots of an operator $\mathbf{A}$ is called the spectrum of the operator. The spectrum is said to be simple if every characteristic root $\lambda_0$ is a simple root of the characteristic polynomial, i.e. if $n_{\lambda_0} = 1$.

The spectrum is said to lie in $K$ if all characteristic roots lie in $K$.

Proposition 3. Any operator with simple spectrum in $K$ is diagonalizable.

Proof. Since $1 \le p_{\lambda_0} \le n_{\lambda_0}$, for $n_{\lambda_0} = 1$ we necessarily have $p_{\lambda_0} = 1$ (and hence $p_{\lambda_0} = n_{\lambda_0}$). □

This diagonalizability condition is not necessary, but it is convenient for a practical check.

Let $\mathcal{P} \ne \mathcal{V}$ be an arbitrary subspace of the space $\mathcal{V}$ invariant under an operator $\mathbf{A}$. Since (see the remark above) the characteristic polynomial $f_{\mathbf{B}}(\lambda)$ of the induced operator $\mathbf{B}\colon \mathcal{V}/\mathcal{P} \to \mathcal{V}/\mathcal{P}$ divides the characteristic polynomial $f_{\mathbf{A}}(\lambda)$ of the operator $\mathbf{A}$, each characteristic root of the operator $\mathbf{B}$ is a characteristic root of the operator $\mathbf{A}$ of at least the same algebraic multiplicity. In particular, if the spectrum of the operator $\mathbf{A}$ lies in $K$, so does the spectrum of the operator $\mathbf{B}$, and hence there exists at least one eigenvalue $\lambda_0$ of $\mathbf{B}$. Let $x_0 + \mathcal{P}$ be a corresponding eigenvector of the operator $\mathbf{B}$. The equation $\mathbf{B}(x_0 + \mathcal{P}) = \lambda_0(x_0 + \mathcal{P})$ implies that $\mathbf{A}x_0 = \lambda_0 x_0 + a_0$, where $a_0 \in \mathcal{P}$, from which it follows that the subspace $\mathcal{Q}$ generated by the subspace $\mathcal{P}$ and the vector $x_0$ (i.e. consisting of all vectors of the form $kx_0 + a$, where $k \in K$ and $a \in \mathcal{P}$; note that $x_0 \notin \mathcal{P}$) is invariant under $\mathbf{A}$. Since $\dim \mathcal{Q} = \dim \mathcal{P} + 1$, this proves the following proposition.

Proposition 4. If the spectrum of a linear operator $\mathbf{A}\colon \mathcal{V} \to \mathcal{V}$ lies in $K$, then any of its invariant subspaces other than $\mathcal{V}$ is contained in an invariant subspace of dimension higher by unity. □
Consequently, beginning with the subspace $\mathcal{P}_0 = 0$, we can construct an ascending chain of invariant subspaces

$0 = \mathcal{P}_0 \subset \mathcal{P}_1 \subset \dots \subset \mathcal{P}_n = \mathcal{V}$

of dimensions $0, 1, \dots, n$. It is clear that in the corresponding basis $e_1, \dots, e_n$ of the space $\mathcal{V}$, i.e. in a basis such that $\mathcal{P}_i = [e_1, \dots, e_i]$ for any $i = 1, \dots, n$, the matrix of the operator $\mathbf{A}$ is a triangular matrix

$\begin{pmatrix} \lambda_1 & & * \\ & \ddots & \\ 0 & & \lambda_m \end{pmatrix}$

whose diagonal elements are the eigenvalues of the operator $\mathbf{A}$, each repeated as many times as its algebraic multiplicity. This proves the following proposition.

Proposition 5. For any linear operator $\mathbf{A}\colon \mathcal{V} \to \mathcal{V}$ with spectrum in $K$ there is in the space $\mathcal{V}$ a basis in which the matrix of the operator is triangular.

When $K = \mathbb{C}$ this proposition applies of course to any linear operator.

We shall first obtain a more precise result for operators 
of one special class. 

Definition 5. An operator $\mathbf{A}$ (matrix $A$) is said to be nilpotent if there exists a natural number $m$ such that $\mathbf{A}^m = 0$ (respectively $A^m = 0$). The smallest such $m$ is called the degree of nilpotency of the operator (the matrix).

It is easy to see that all eigenvalues of a nilpotent operator are equal to zero. Indeed, if $\mathbf{A}x = \lambda x$, then $\mathbf{A}^k x = \lambda^k x$ for any $k$, and hence when $\mathbf{A}^m = 0$ and $x \ne 0$, necessarily $\lambda^m = 0$, i.e. $\lambda = 0$. □

Therefore it is impossible for a nonzero nilpotent operator to be diagonalizable.

One example of a nilpotent operator is an operator for which there exists a vector $e \ne 0$ such that the vectors

$e, \mathbf{A}e, \dots, \mathbf{A}^{n-1}e$

constitute a basis of the space $\mathcal{V}$, and $\mathbf{A}^n e = 0$. In the basis

$e_1 = \mathbf{A}^{n-1}e, \dots, e_{n-1} = \mathbf{A}e, \; e_n = e$

the matrix of this operator is

$\begin{pmatrix} 0 & 1 & & 0 \\ & 0 & \ddots & \\ & & \ddots & 1 \\ 0 & & & 0 \end{pmatrix}.$

Operators of such a form are called cyclic operators.

For any vector $x = x^1 e_1 + \dots + x^n e_n$ and any $m \le n$ we have $\mathbf{A}^m x = x^{m+1} e_1 + \dots + x^n e_{n-m}$ and, in particular, $\mathbf{A}^n = 0$. Thus a cyclic operator is nilpotent and its degree of nilpotency equals $n$. □

When $n = 1$ the cyclic operator is zero.
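The matrix of a cyclic operator and its degree of nilpotency are easy to reproduce. The following Python sketch is an added illustration; the function names `shift_matrix` and `nilpotency_degree` are ours.

```python
def matmul(X, Y):
    """Product of two square matrices stored as lists of rows."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def shift_matrix(n):
    """Matrix of a cyclic operator: A e_1 = 0 and A e_k = e_(k-1) for k > 1."""
    return [[1 if j == i + 1 else 0 for j in range(n)] for i in range(n)]

def nilpotency_degree(A):
    """Smallest m >= 1 with A^m = 0, or None if A is not nilpotent.

    For an n x n matrix it suffices to examine the powers up to the n-th.
    """
    n = len(A)
    power = A
    for m in range(1, n + 1):
        if all(v == 0 for row in power for v in row):
            return m
        power = matmul(power, A)
    return None

A = shift_matrix(4)
A_squared = matmul(A, A)
```

For the $4 \times 4$ matrix `A` the ones move one diagonal higher with each multiplication, and the fourth power is the first to vanish.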

It turns out that an arbitrary nilpotent operator reduces 
to cyclic operators. 

Proposition 6. For any nilpotent operator $\mathbf{A}\colon \mathcal{V} \to \mathcal{V}$ there exists a decomposition of the space $\mathcal{V}$ as a direct sum of invariant subspaces on each of which the operator $\mathbf{A}$ induces a cyclic operator.

We shall prove this proposition in the next lecture. 




Lecture 16 


Decomposition of a nilpotent operator as a direct sum of cyclic operators • Root subspaces • Normal Jordan form • The Hamilton-Cayley theorem


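Let $\mathbf{A}\colon \mathcal{V} \to \mathcal{V}$ be an arbitrary nilpotent operator and let

$\mathcal{P}_i = \mathbf{A}^i\mathcal{V}, \quad i = 0, 1, \dots, m,$

where $m$ is the degree of nilpotency of the operator $\mathbf{A}$. Since $\mathbf{A}^{i+1}x = \mathbf{A}^i(\mathbf{A}x)$, we have

$0 = \mathcal{P}_m \subset \mathcal{P}_{m-1} \subset \dots \subset \mathcal{P}_{i+1} \subset \mathcal{P}_i \subset \dots \subset \mathcal{P}_1 \subset \mathcal{P}_0 = \mathcal{V}$

($\mathbf{A}^0 = \mathbf{E}$ by definition, and hence $\mathcal{P}_0 = \mathcal{V}$ even for $\mathbf{A} = 0$).

By construction $\mathbf{A}\mathcal{P}_i = \mathcal{P}_{i+1}$, $0 \le i < m$, from which it follows in particular that $\mathcal{P}_{m-1} \subset \operatorname{Ker} \mathbf{A}$. Hence for any basis

(1) $e_1^{(m-1)}, \dots, e_{p_{m-1}}^{(m-1)}, \quad p_{m-1} = \dim \mathcal{P}_{m-1},$

of the space $\mathcal{P}_{m-1}$ the relations

(2) $\mathbf{A}e_1^{(m-1)} = 0, \dots, \mathbf{A}e_{p_{m-1}}^{(m-1)} = 0$

hold. In addition there are in the space $\mathcal{P}_{m-2}$ vectors

$e_1^{(m-2)}, \dots, e_{p_{m-1}}^{(m-2)}$

such that

(3) $\mathbf{A}e_1^{(m-2)} = e_1^{(m-1)}, \dots, \mathbf{A}e_{p_{m-1}}^{(m-2)} = e_{p_{m-1}}^{(m-1)}.$

It turns out that the vectors

(4) $e_1^{(m-1)}, \dots, e_{p_{m-1}}^{(m-1)}, \; e_1^{(m-2)}, \dots, e_{p_{m-1}}^{(m-2)}$

of the space $\mathcal{P}_{m-2}$ are linearly independent. Indeed, if

$k^1 e_1^{(m-1)} + \dots + k^p e_p^{(m-1)} + l^1 e_1^{(m-2)} + \dots + l^p e_p^{(m-2)} = 0,$

where $p = p_{m-1}$, then by applying to this equation the operator $\mathbf{A}$ we obtain by virtue of (2) and (3) the equation

$l^1 e_1^{(m-1)} + \dots + l^p e_p^{(m-1)} = 0,$

which is true (since (1) is a basis of the subspace $\mathcal{P}_{m-1}$) only when $l^1 = 0, \dots, l^p = 0$. But then

$k^1 e_1^{(m-1)} + \dots + k^p e_p^{(m-1)} = 0$

and hence for the same reasons $k^1 = 0, \dots, k^p = 0$. □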
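Therefore we can extend the vectors (4) to some basis

(5) $e_1^{(m-1)}, \dots, e_{p_{m-1}}^{(m-1)}, \; e_1^{(m-2)}, \dots, e_{p_{m-1}}^{(m-2)}, \dots, e_{p_{m-2}}^{(m-2)}, \quad p_{m-2} = \dim \mathcal{P}_{m-2} - \dim \mathcal{P}_{m-1},$

of the space $\mathcal{P}_{m-2}$. It is easy to see that the complementary vectors

(6) $e_{p_{m-1}+1}^{(m-2)}, \dots, e_{p_{m-2}}^{(m-2)}$

can be taken from the kernel $\operatorname{Ker} \mathbf{A}$ of the operator $\mathbf{A}$, i.e. so that we have

(7) $\mathbf{A}e_{p_{m-1}+1}^{(m-2)} = 0, \dots, \mathbf{A}e_{p_{m-2}}^{(m-2)} = 0.$

Indeed, since the vectors (1) constitute a basis of the space $\mathcal{P}_{m-1} = \mathbf{A}(\mathcal{P}_{m-2})$, we have for an arbitrary choice of the vectors (6) and any $i = 1, \dots, p_{m-2} - p_{m-1}$

$\mathbf{A}e_{p+i}^{(m-2)} = a_i^1 e_1^{(m-1)} + \dots + a_i^p e_p^{(m-1)},$

where $p = p_{m-1}$. Therefore, by replacing the vectors $e_{p+i}^{(m-2)}$ with the vectors

$e_{p+i}^{(m-2)} - a_i^1 e_1^{(m-2)} - \dots - a_i^p e_p^{(m-2)}$

we satisfy conditions (7). □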

Since $\mathbf{A}(\mathcal{P}_{m-3}) = \mathcal{P}_{m-2}$, there exist in the subspace $\mathcal{P}_{m-3}$ vectors

(8) $e_1^{(m-3)}, \dots, e_{p_{m-2}}^{(m-3)}$

such that

(9) $\mathbf{A}e_1^{(m-3)} = e_1^{(m-2)}, \dots, \mathbf{A}e_{p_{m-2}}^{(m-3)} = e_{p_{m-2}}^{(m-2)}.$

It can be shown by the same method as that used for the family of vectors (4) that the vectors (5) and (8) constitute together a linearly independent family. Indeed, by applying the operator $\mathbf{A}$ to an arbitrary linear combination of these vectors that is equal to zero we obtain by virtue of (2), (3), (7), and (9) a linear combination of vectors (5). The corresponding coefficients are therefore zero, and hence all that is left of the entire combination is a combination of the vectors (1) and (6) from the kernel. Since these vectors are linearly independent, the remaining coefficients are also zero. □

This linearly independent family can be extended to a basis

$e_1^{(m-1)}, \dots, e_{p_{m-1}}^{(m-1)}, \; e_1^{(m-2)}, \dots, e_{p_{m-2}}^{(m-2)}, \; e_1^{(m-3)}, \dots, e_{p_{m-2}}^{(m-3)}, \dots, e_{p_{m-3}}^{(m-3)}$

of the space $\mathcal{P}_{m-3}$, the argument employed for the vectors (6) similarly showing that the complementary vectors

$e_{p_{m-2}+1}^{(m-3)}, \dots, e_{p_{m-3}}^{(m-3)}$

can be taken from the kernel of the operator $\mathbf{A}$, i.e. so that we have

$\mathbf{A}e_{p_{m-2}+1}^{(m-3)} = 0, \dots, \mathbf{A}e_{p_{m-3}}^{(m-3)} = 0.$



Continuing this construction step by step we finally obtain in the space $\mathcal{P}_0 = \mathcal{V}$ a basis whose vectors are arranged in a stepped array of the form

$\begin{array}{l} e_1^{(m-1)}, \dots, e_{p_{m-1}}^{(m-1)}, \\ e_1^{(m-2)}, \dots, e_{p_{m-1}}^{(m-2)}, \dots, e_{p_{m-2}}^{(m-2)}, \\ e_1^{(m-3)}, \dots, e_{p_{m-1}}^{(m-3)}, \dots, e_{p_{m-2}}^{(m-3)}, \dots, e_{p_{m-3}}^{(m-3)}, \\ \dots\dots\dots\dots\dots\dots\dots \\ e_1^{(0)}, \dots, e_{p_{m-1}}^{(0)}, \dots, e_{p_{m-2}}^{(0)}, \dots, e_{p_{m-3}}^{(0)}, \dots, e_{p_0}^{(0)} \end{array}$

having the property that under the operator $\mathbf{A}$ the vectors of each column mount a step while remaining in the same column (the uppermost vectors becoming zero).

This property means by definition that the vectors of each column generate an invariant subspace, the restriction of the operator $\mathbf{A}$ to this subspace being a cyclic operator (the lowest vector in the column obviously serves as the vector $e$ for that operator; see the preceding lecture). Since the space $\mathcal{V}$ is a direct sum of these invariant subspaces, Proposition 6 of the preceding lecture is thus completely proved. □
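The stepped array also shows how the sizes of the cyclic blocks can be read off from the ranks of the powers of the matrix: the row with superscript $(i)$ contains $\operatorname{rank} A^i - \operatorname{rank} A^{i+1}$ vectors, and the height of a column is the number of rows that reach it. The Python sketch below is an added illustration of this bookkeeping; it does not construct the basis itself. The matrix `A` is hypothetical data and the function names are ours.

```python
from fractions import Fraction

def rank(rows):
    """Rank of a matrix (list of rows) by Gaussian elimination over Q."""
    M = [[Fraction(v) for v in row] for row in rows]
    r = 0
    for col in range(len(M[0]) if M else 0):
        pivot = next((i for i in range(r, len(M)) if M[i][col] != 0), None)
        if pivot is None:
            continue
        M[r], M[pivot] = M[pivot], M[r]
        for i in range(r + 1, len(M)):
            factor = M[i][col] / M[r][col]
            M[i] = [a - factor * b for a, b in zip(M[i], M[r])]
        r += 1
    return r

def matmul(X, Y):
    """Product of two square matrices stored as lists of rows."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def row_lengths(A):
    """Numbers rank A^i - rank A^(i+1), i = 0, 1, ..., for a nilpotent matrix A."""
    n = len(A)
    power = [[1 if i == j else 0 for j in range(n)] for i in range(n)]  # A^0 = E
    ranks = [n]
    while ranks[-1] > 0:
        power = matmul(power, A)
        ranks.append(rank(power))
        if len(ranks) > n + 1:
            raise ValueError("the matrix is not nilpotent")
    return [ranks[i] - ranks[i + 1] for i in range(len(ranks) - 1)]

def cyclic_block_sizes(A):
    """Heights of the columns of the stepped array, i.e. sizes of the cyclic blocks."""
    p = row_lengths(A)
    return [sum(1 for p_i in p if p_i >= j) for j in range(1, p[0] + 1)]

# Hypothetical example: direct sum of a 3 x 3 and a 1 x 1 cyclic matrix.
A = [[0, 1, 0, 0],
     [0, 0, 1, 0],
     [0, 0, 0, 0],
     [0, 0, 0, 0]]
sizes = cyclic_block_sizes(A)
```

For the matrix above the ranks of $A^0, A^1, A^2, A^3$ are 4, 2, 1, 0, so the rows of the array have lengths 2, 1, 1 (from the bottom up) and the columns have heights 3 and 1.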

We now return to arbitrary operators. 

The proper subspace can be defined as a maximum 
subspace of a space f" on which the operator A — XE is 
equal to zero. By analogy we introduce the following defini¬ 
tion. 

Definition 1. A maximum invariant subspace of a space 
f on which the operator A — XE is nilpotent is called 
a root subspace of the operator A belonging to an eigenvalue X. 

We explain this definition. 

A vector $x \ne 0$ of the space $\mathcal{V}$ is said to be a root vector
belonging to an eigenvalue $\lambda$ if there exists an integer $m \ge 0$
such that

$$(\mathbf{A} - \lambda\mathbf{E})^m x = 0.$$

It is clear that for any $k \in \mathbb{K}$ the vector $kx$ is also a root
vector belonging to $\lambda$ (or zero). It is easy to see as well that
a similar statement is true for a sum of root vectors
belonging to the same eigenvalue, for if $(\mathbf{A} - \lambda\mathbf{E})^{m_1} x_1 = 0$
and $(\mathbf{A} - \lambda\mathbf{E})^{m_2} x_2 = 0$, then $(\mathbf{A} - \lambda\mathbf{E})^m (x_1 + x_2) = 0$,
where $m = \max\{m_1, m_2\}$. This means that the set of all root vectors
belonging to a given eigenvalue $\lambda$, supplemented by the zero vector,
is a subspace of the space $\mathcal{V}$. This subspace is
exactly the subspace $\mathcal{R}_\lambda$ described in Definition 1, since
it is obviously invariant under the operator $\mathbf{A} - \lambda\mathbf{E}$ and
hence under the operator $\mathbf{A}$.
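The definition of a root vector can be illustrated by a small computation. The sketch below is only an illustration under stated assumptions: the matrix `A`, the vector `x` and the helper names `mat_vec` and `shifted` are hypothetical and do not come from the text.

```python
# Minimal sketch: a root vector that is not an eigenvector.
# The matrix A and the helper names are illustrative assumptions.

def mat_vec(M, v):
    """Multiply a square matrix (list of rows) by a vector."""
    return [sum(M[i][j] * v[j] for j in range(len(v))) for i in range(len(M))]

def shifted(M, lam):
    """Return the matrix M - lam * E."""
    n = len(M)
    return [[M[i][j] - (lam if i == j else 0) for j in range(n)] for i in range(n)]

A = [[2, 1, 0],
     [0, 2, 0],
     [0, 0, 3]]
lam = 2
N = shifted(A, lam)      # N = A - 2E

x = [0, 1, 0]
Nx = mat_vec(N, x)       # (A - 2E) x
NNx = mat_vec(N, Nx)     # (A - 2E)^2 x

print(Nx)    # nonzero, so x is not an eigenvector for lam = 2
print(NNx)   # zero, so x is a root vector with m = 2
```

Here the proper subspace belonging to 2 is one-dimensional, while the root subspace also contains `x` and is two-dimensional.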

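It is clear that any subspace of the space $\mathcal{V}$ invariant
under an operator $\mathbf{A}$ is also invariant under every operator
of the form $\mathbf{A} - \mu\mathbf{E}$, $\mu \in \mathbb{K}$. In particular the subspace $\mathcal{R}_\lambda$
is invariant under any operator of the form $\mathbf{A} - \mu\mathbf{E}$, $\mu \in \mathbb{K}$,
and hence the operator

(10)  $(\mathbf{A} - \mu\mathbf{E})|_{\mathcal{R}_\lambda}$

is defined.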

Proposition 1. When $\mu \ne \lambda$ the operator (10) is invertible.

Proof. If

$$(\mathbf{A} - \mu\mathbf{E})x = 0,$$

where $x \in \mathcal{R}_\lambda$, then $\mathbf{A}x = \mu x$ and therefore
$(\mathbf{A} - \lambda\mathbf{E})x = (\mu - \lambda)x$. Hence either the vector $x$ is zero or the number
$\mu - \lambda$ is an eigenvalue of the nilpotent operator

$$(\mathbf{A} - \lambda\mathbf{E})|_{\mathcal{R}_\lambda}.$$

But, as we know, all eigenvalues of an arbitrary nilpotent
operator are zero. Since by hypothesis $\mu \ne \lambda$, it follows
that $x = 0$. This proves that the kernel of the operator
(10) contains only the zero vector. Therefore (Proposition 2
of Lecture 14) the operator (10) is invertible. □

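Now we are in a position to prove the analogue of Proposition 1
of the preceding lecture for root subspaces.

Let $\lambda_1, \dots, \lambda_m$ be distinct eigenvalues of an operator $\mathbf{A}$
and let

$$\mathcal{R}_1 = \mathcal{R}_{\lambda_1}, \ \dots, \ \mathcal{R}_m = \mathcal{R}_{\lambda_m}$$

be the root subspaces belonging to them.

Proposition 2. The sum

$$\mathcal{R}_1 + \dots + \mathcal{R}_m$$

of the subspaces $\mathcal{R}_1, \dots, \mathcal{R}_m$ is direct, i.e. the equation

$$x_1 + \dots + x_m = 0,$$

where $x_1 \in \mathcal{R}_1, \dots, x_m \in \mathcal{R}_m$, holds if and only if

$$x_1 = 0, \ \dots, \ x_m = 0.$$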

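Proof (cf. the proof of Proposition 1 in the preceding
lecture). For $m = 1$ the proposition is obvious. Suppose
it is already proved for $m - 1$ root subspaces. Since
$x_m \in \mathcal{R}_m$, there exists a number $s$ such that

$$(\mathbf{A} - \lambda_m\mathbf{E})^s x_m = 0.$$

Therefore

$$y_1 + \dots + y_{m-1} = 0,$$

where

$$y_1 = (\mathbf{A} - \lambda_m\mathbf{E})^s x_1, \ \dots, \ y_{m-1} = (\mathbf{A} - \lambda_m\mathbf{E})^s x_{m-1}.$$

Since the subspaces $\mathcal{R}_1, \dots, \mathcal{R}_{m-1}$ are invariant under
the operator $\mathbf{A} - \lambda_m\mathbf{E}$, we have $y_1 \in \mathcal{R}_1, \dots, y_{m-1} \in \mathcal{R}_{m-1}$
and hence by the induction hypothesis

$$y_1 = 0, \ \dots, \ y_{m-1} = 0.$$

Since according to Proposition 1 the operator $\mathbf{A} - \lambda_m\mathbf{E}$
is invertible on the subspaces $\mathcal{R}_1, \dots, \mathcal{R}_{m-1}$, it follows
that

$$x_1 = 0, \ \dots, \ x_{m-1} = 0$$

and hence $x_m = 0$. □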

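An advantage of root subspaces over proper subspaces
manifests itself in the following proposition.

Proposition 3. For any eigenvalue $\lambda$ of an operator $\mathbf{A}$ the
dimension of the root subspace $\mathcal{R}_\lambda$ is equal to the algebraic
multiplicity $n_\lambda$ of the eigenvalue:

$$\dim \mathcal{R}_\lambda = n_\lambda.$$

Proof. Let $\mathbf{A}_1 = \mathbf{A}|_{\mathcal{R}_\lambda}$ and let $\mathbf{B}$ be the operator
$\mathcal{V}/\mathcal{R}_\lambda \to \mathcal{V}/\mathcal{R}_\lambda$ induced by the operator $\mathbf{A}$.
Then $f_{\mathbf{A}} = f_{\mathbf{A}_1} f_{\mathbf{B}}$.
Consequently, if $\dim \mathcal{R}_\lambda < n_\lambda$, then the number $\lambda$ is a root
of the polynomial $f_{\mathbf{B}}$. Hence, since $\lambda \in \mathbb{K}$, it is at the same
time an eigenvalue of the operator $\mathbf{B}$.

Let $x_0 + \mathcal{R}_\lambda$ be the corresponding eigenvector. Then

$$\mathbf{A}x_0 = \lambda x_0 + a_0,$$

where $a_0 \in \mathcal{R}_\lambda$, i.e. $a_0 = (\mathbf{A} - \lambda\mathbf{E})x_0$. Therefore there exists
$m$ such that $(\mathbf{A} - \lambda\mathbf{E})^m a_0 = 0$. But then

$$(\mathbf{A} - \lambda\mathbf{E})^{m+1} x_0 = 0$$

and hence $x_0 \in \mathcal{R}_\lambda$, which is impossible (an eigenvector of $\mathbf{B}$
is a nonzero coset). The contradiction obtained shows that
$\dim \mathcal{R}_\lambda = n_\lambda$. □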

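It follows from Proposition 3 that if the spectrum of
the operator $\mathbf{A}$ lies in $\mathbb{K}$ and $\lambda_1, \dots, \lambda_m$ are all the
eigenvalues (characteristic roots) it has, then

$$\dim(\mathcal{R}_1 \oplus \dots \oplus \mathcal{R}_m) = n_{\lambda_1} + \dots + n_{\lambda_m} = n$$

and hence

$$\mathcal{R}_1 \oplus \dots \oplus \mathcal{R}_m = \mathcal{V}.$$

Thus the following theorem holds.

Theorem 1. For any linear operator $\mathbf{A}\colon \mathcal{V} \to \mathcal{V}$ whose
spectrum lies in $\mathbb{K}$ the space $\mathcal{V}$ is the direct sum of the root
subspaces of that operator:

(11)  $\mathcal{V} = \mathcal{R}_1 \oplus \dots \oplus \mathcal{R}_m.$ □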

To say that an invariant subspace $\mathcal{R}$ of the space $\mathcal{V}$ is
a root subspace is the same as to say that the restriction
of the operator $\mathbf{A}$ to that subspace is a sum $\lambda\mathbf{E} + \mathbf{B}$ of a scalar
operator $\lambda\mathbf{E}$ and some nilpotent operator $\mathbf{B}$. But according
to Proposition 5 of the preceding lecture, for the operator $\mathbf{B}$
there exists a decomposition of the subspace $\mathcal{R}$ as a direct
sum of subspaces invariant (under $\mathbf{B}$ and hence under $\mathbf{A}$)
on each of which the operator $\mathbf{B}$ induces a cyclic operator.
On carrying out this decomposition for every root subspace
in (11), we obtain a decomposition of the space $\mathcal{V}$ as a direct
sum of invariant subspaces on each of which the operator $\mathbf{A}$
induces an operator of the form

(12)  $\lambda\mathbf{E} + \mathbf{C}$,

where $\lambda \in \mathbb{K}$ and $\mathbf{C}$ is some cyclic operator.







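Definition 2. A matrix of the form

(13)  $\begin{pmatrix} \lambda & 1 & 0 & \cdots & 0 & 0 \\ 0 & \lambda & 1 & \cdots & 0 & 0 \\ 0 & 0 & \lambda & \cdots & 0 & 0 \\ \vdots & \vdots & \vdots & \ddots & \vdots & \vdots \\ 0 & 0 & 0 & \cdots & \lambda & 1 \\ 0 & 0 & 0 & \cdots & 0 & \lambda \end{pmatrix}$

is called a Jordan cell. A matrix $A$ is said to have a normal
Jordan form if it is a direct sum of Jordan cells (each in
general with a different $\lambda$).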

Since the matrix of any cyclic operator has in an appropriate
basis the form (11) from the preceding lecture, the
matrix of the operator (12) in the same basis is the Jordan
cell (13). By combining all the bases of the corresponding
subspaces, therefore, we obtain a basis of the space $\mathcal{V}$ in
which the matrix of the operator $\mathbf{A}$ has a normal Jordan form. This
proves the following theorem.

Theorem 2 (reduction to Jordan form). For any linear
operator $\mathbf{A}\colon \mathcal{V} \to \mathcal{V}$ whose spectrum lies in $\mathbb{K}$, there exists
a basis of the space $\mathcal{V}$ in which its matrix has a normal Jordan
form. □

It turns out that up to the order of the cells the normal
Jordan form of the matrix of an operator is uniquely determined,
i.e. the number of Jordan cells, their sizes and the
corresponding numbers $\lambda$ are the same for all bases in which
the matrix of the operator has a normal Jordan form. As regards
the numbers $\lambda$ this is obvious (since they are the eigenvalues
of the operator). The statement concerning the number
and sizes of the Jordan cells we shall not prove.

Remark. The uniqueness of the normal Jordan form of
a matrix follows immediately from the fact that the number
of Jordan cells of order $k$ corresponding to an eigenvalue $\lambda$
is expressed by the formula

$$r(A - \lambda E)^{k-1} - 2r(A - \lambda E)^k + r(A - \lambda E)^{k+1},$$

where $r(C)$ is the rank of a matrix $C$.

We leave the proof of the formula to the reader as a useful
exercise.
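The rank formula of the Remark can be tried out on a matrix that is already in normal Jordan form. The following is a minimal sketch, not a proof: the helper names `mat_mul`, `rank` and `jordan_cell_count` and the sample matrix `J` are assumptions made for illustration, and ranks are computed by Gaussian elimination over the rationals.

```python
from fractions import Fraction

def mat_mul(X, Y):
    """Product of two square matrices given as lists of rows."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def rank(M):
    """Rank by Gaussian elimination over the rationals."""
    rows = [[Fraction(v) for v in row] for row in M]
    r = 0
    for col in range(len(rows[0])):
        pivot = next((i for i in range(r, len(rows)) if rows[i][col] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(r + 1, len(rows)):
            factor = rows[i][col] / rows[r][col]
            rows[i] = [a - factor * b for a, b in zip(rows[i], rows[r])]
        r += 1
    return r

def jordan_cell_count(A, lam, k):
    """Number of Jordan cells of order k for eigenvalue lam (rank formula)."""
    n = len(A)
    N = [[A[i][j] - (lam if i == j else 0) for j in range(n)] for i in range(n)]
    power = [[int(i == j) for j in range(n)] for i in range(n)]  # N^0 = E
    ranks = [rank(power)]
    for _ in range(k + 1):
        power = mat_mul(power, N)
        ranks.append(rank(power))          # ranks[p] = rank of N^p
    return ranks[k - 1] - 2 * ranks[k] + ranks[k + 1]

# Hypothetical sample: cells of orders 2 and 1 for lam = 5, one cell of order 1 for lam = 7.
J = [[5, 1, 0, 0],
     [0, 5, 0, 0],
     [0, 0, 5, 0],
     [0, 0, 0, 7]]

print([jordan_cell_count(J, 5, k) for k in (1, 2, 3)])
print(jordan_cell_count(J, 7, 1))
```

Since ranks do not change under a change of basis, the same counts would be obtained from any matrix similar to `J`.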






We stress that when $\mathbb{K} = \mathbb{C}$ the condition on the spectrum
of the operator $\mathbf{A}$ in Theorem 2 holds for any operator,
so that over the field $\mathbb{C}$ every linear operator reduces to Jordan
form. □

Here is one example of applying the results obtained.

Let

$$f(x) = a_0 x^m + a_1 x^{m-1} + \dots + a_m$$

be an arbitrary polynomial over a field $\mathbb{K}$. Then for any
operator $\mathbf{A}$ (any matrix $A$) the operator

$$f(\mathbf{A}) = a_0 \mathbf{A}^m + a_1 \mathbf{A}^{m-1} + \dots + a_m \mathbf{E}$$

(the matrix $f(A) = a_0 A^m + a_1 A^{m-1} + \dots + a_m E$), called
a polynomial in the operator $\mathbf{A}$ (in the matrix $A$), is defined.

It is obvious that any subspace $\mathcal{P}$ invariant under
the operator $\mathbf{A}$ is also invariant under every operator $f(\mathbf{A})$.
Moreover

(14)  $f(\mathbf{A})|_{\mathcal{P}} = f(\mathbf{A}|_{\mathcal{P}}).$

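In particular, for any operator $\mathbf{A}$ the operator

$$f_{\mathbf{A}}(\mathbf{A})$$

is defined, where $f_{\mathbf{A}}(\lambda) = \det(A - \lambda E)$ is the characteristic
polynomial of the operator $\mathbf{A}$. Let us compute this operator.
First let

(15)  $\mathbf{A} = \lambda_0\mathbf{E} + \mathbf{C},$

where $\mathbf{C}$ is a cyclic operator. Then $f_{\mathbf{A}}(\lambda) = (\lambda_0 - \lambda)^n$ and
$\mathbf{C}^n = 0$. Therefore

$$f_{\mathbf{A}}(\mathbf{A}) = (\lambda_0\mathbf{E} - \mathbf{A})^n = (-\mathbf{C})^n = 0.$$

Now let the operator $\mathbf{A}$ (with spectrum in $\mathbb{K}$) be arbitrary
and let

$$\mathcal{V} = \mathcal{P}_1 \oplus \dots \oplus \mathcal{P}_N$$

be a decomposition of the space $\mathcal{V}$ as a direct sum of invariant
subspaces on each of which the restriction $\mathbf{A}_i = \mathbf{A}|_{\mathcal{P}_i}$
of the operator $\mathbf{A}$ has the form (15). Then, according to
what has been proved,

(16)  $f_{\mathbf{A}_i}(\mathbf{A}_i) = 0.$

But, as we know, every polynomial $f_{\mathbf{A}_i}$ divides the polynomial
$f_{\mathbf{A}}$ (moreover, the polynomial $f_{\mathbf{A}}$ is easily seen to be
the product of the polynomials $f_{\mathbf{A}_1}, \dots, f_{\mathbf{A}_N}$). It therefore
follows from (16) that

$$f_{\mathbf{A}}(\mathbf{A}_i) = 0.$$

Hence (see formula (14))

$$f_{\mathbf{A}}(\mathbf{A})|_{\mathcal{P}_i} = 0, \quad i = 1, \dots, N.$$

Thus the operator $f_{\mathbf{A}}(\mathbf{A})$ has the property that for any
$i = 1, \dots, N$ its restriction to the subspace $\mathcal{P}_i$ is zero.
Consequently this operator is equal to zero on the sum of these
subspaces as well, i.e. on the entire space $\mathcal{V}$.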

This proves the following theorem.

Theorem 3 (Hamilton-Cayley theorem). Every operator
annuls its characteristic polynomial:

$$f_{\mathbf{A}}(\mathbf{A}) = 0.$$ □

We have proved the theorem for operators whose spectrum
lies in $\mathbb{K}$ and thereby, in particular, for any operators over
the field $\mathbb{C}$. In the next lecture we shall prove it (over the
field $\mathbb{R}$) for operators with an arbitrary spectrum.
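The Hamilton-Cayley theorem is easy to check by hand in the smallest nontrivial case. The sketch below assumes a $2 \times 2$ matrix, for which $\det(A - tE) = t^2 - (\operatorname{tr} A)\,t + \det A$; the function names and the sample matrix are hypothetical.

```python
# Sketch: check the Hamilton-Cayley theorem for one 2x2 integer matrix.
# For a 2x2 matrix, det(A - t E) = t^2 - (tr A) t + det A.

def mat_mul(X, Y):
    """Product of two square matrices given as lists of rows."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def char_poly_at_matrix(A):
    """Evaluate f_A(A) = A^2 - (tr A) A + (det A) E for a 2x2 matrix A."""
    trace = A[0][0] + A[1][1]
    det = A[0][0] * A[1][1] - A[0][1] * A[1][0]
    A2 = mat_mul(A, A)
    E = [[1, 0], [0, 1]]
    return [[A2[i][j] - trace * A[i][j] + det * E[i][j] for j in range(2)]
            for i in range(2)]

A = [[1, 2], [3, 4]]          # hypothetical sample matrix
result = char_poly_at_matrix(A)
print(result)                 # the zero matrix
```

The characteristic roots of this particular matrix are real, so it falls under the case already proved.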












Lecture 17 


Complexification of a linear operator • Proper subspaces belonging 
to characteristic roots • Operators whose complexification is 
diagonalizable 


The results of the preceding lecture were obtained on the
assumption that the spectrum of an operator $\mathbf{A}$ lies in the
ground field $\mathbb{K}$. This condition is automatically fulfilled
when $\mathbb{K} = \mathbb{C}$, but already for $\mathbb{K} = \mathbb{R}$ it substantially
restricts the applicability of the results. In this lecture
we shall find out what happens for operators that fail to satisfy
this condition. For simplicity we shall consider only the
geometrically interesting case $\mathbb{K} = \mathbb{R}$, although by introducing
some insignificant and inessential complications this
can be extended to the case of a quite arbitrary field $\mathbb{K}$.

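Recall (see Lecture 19 in [1]) that from any vector space $\mathcal{V}$
over the field $\mathbb{R}$ we can construct a vector space $\mathcal{V}^{\mathbb{C}}$ over
the field $\mathbb{C}$ called the complexification of the space $\mathcal{V}$. This
space possesses the property that each of its vectors $z$ can
be uniquely represented as

$$z = x + iy,$$

where $x \in \mathcal{V}$ and $y \in \mathcal{V}$. For every linear operator $\mathbf{A}\colon \mathcal{V} \to \mathcal{V}$
therefore the formula

$$\mathbf{A}^{\mathbb{C}}(x + iy) = \mathbf{A}x + i\mathbf{A}y$$

correctly defines some operator $\mathbf{A}^{\mathbb{C}}\colon \mathcal{V}^{\mathbb{C}} \to \mathcal{V}^{\mathbb{C}}$. Since for
any number $a + ib \in \mathbb{C}$ and any vector $z = x + iy \in \mathcal{V}^{\mathbb{C}}$

$$(a + ib)(x + iy) = (ax - by) + i(ay + bx),$$

we have

$$
\begin{aligned}
\mathbf{A}^{\mathbb{C}}((a + ib)(x + iy)) &= \mathbf{A}(ax - by) + i\mathbf{A}(ay + bx) = \\
&= (a\mathbf{A}x - b\mathbf{A}y) + i(a\mathbf{A}y + b\mathbf{A}x) = \\
&= (a + ib)(\mathbf{A}x + i\mathbf{A}y) = \\
&= (a + ib)\,\mathbf{A}^{\mathbb{C}}(x + iy),
\end{aligned}
$$

so that $\mathbf{A}^{\mathbb{C}}(cz) = c\mathbf{A}^{\mathbb{C}}z$. It is still easier to verify that

$$\mathbf{A}^{\mathbb{C}}(z_1 + z_2) = \mathbf{A}^{\mathbb{C}}z_1 + \mathbf{A}^{\mathbb{C}}z_2$$

for any vectors $z_1, z_2 \in \mathcal{V}^{\mathbb{C}}$. Hence the operator $\mathbf{A}^{\mathbb{C}}$ is linear. □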
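The verification above can be mirrored in a few lines of code. In this sketch a vector $z = x + iy$ of the complexification is stored as a pair of real coordinate lists and a complex number $a + ib$ as a pair `(a, b)`; the names `complexify` and `scale` and the sample data are illustrative assumptions.

```python
# Sketch: the complexification A^C acting on pairs (x, y) standing for x + iy.

def mat_vec(M, v):
    """Multiply a square matrix (list of rows) by a vector."""
    return [sum(M[i][j] * v[j] for j in range(len(v))) for i in range(len(M))]

def complexify(M):
    """Return the map A^C: (x, y) -> (Ax, Ay), i.e. x + iy -> Ax + iAy."""
    def AC(z):
        x, y = z
        return (mat_vec(M, x), mat_vec(M, y))
    return AC

def scale(c, z):
    """Multiply z = x + iy by c = a + ib: (ax - by) + i(ay + bx)."""
    a, b = c
    x, y = z
    return ([a * xi - b * yi for xi, yi in zip(x, y)],
            [a * yi + b * xi for xi, yi in zip(x, y)])

A = [[1, 2], [3, 4]]       # hypothetical real matrix
AC = complexify(A)
z = ([1, 0], [0, 1])       # z = x + iy
c = (2, 3)                 # c = 2 + 3i

left = AC(scale(c, z))     # A^C(cz)
right = scale(c, AC(z))    # c A^C(z)
print(left, right)
```

Both sides agree, as the computation in the text predicts for every $c$ and $z$.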

Definition 1. The operator $\mathbf{A}^{\mathbb{C}}$ is called the complexification
of the operator $\mathbf{A}$.

As we know (see Proposition 1 of Lecture 19 in [1]; 
recall that in terms of the proposition {Y"^)^ = 5 ^), any 

basis $e_1, \dots, e_n$ of the space $\mathcal{V}$ is also a basis of the space $\mathcal{V}^{\mathbb{C}}$.
It follows that in every such (“real”) basis the matrix of
the operator $\mathbf{A}^{\mathbb{C}}$ coincides with the matrix of the operator $\mathbf{A}$.
In this sense the matrix of an operator is not affected when
the operator undergoes complexification. Hence, in particular,
the operators $\mathbf{A}$ and $\mathbf{A}^{\mathbb{C}}$ have the same characteristic
polynomial:

(1)  $f_{\mathbf{A}^{\mathbb{C}}} = f_{\mathbf{A}}.$

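It follows immediately, among other things, that the
Hamilton-Cayley theorem (Theorem 3 of the preceding lecture)
is true for any operator $\mathbf{A}\colon \mathcal{V} \to \mathcal{V}$. Indeed, the proof implies
that the theorem is true for the operator $\mathbf{A}^{\mathbb{C}}$, and the operator
$f_{\mathbf{A}}(\mathbf{A})$ is obviously the restriction of the operator
$f_{\mathbf{A}}(\mathbf{A}^{\mathbb{C}}) = f_{\mathbf{A}^{\mathbb{C}}}(\mathbf{A}^{\mathbb{C}})$ to $\mathcal{V} = \operatorname{Re} \mathcal{V}^{\mathbb{C}}$.
Therefore, since $f_{\mathbf{A}^{\mathbb{C}}}(\mathbf{A}^{\mathbb{C}}) = 0$,
we have $f_{\mathbf{A}}(\mathbf{A}) = 0$. □

In view of (1) the operators $\mathbf{A}$ and $\mathbf{A}^{\mathbb{C}}$ have the same
characteristic roots. These are all the eigenvalues of the
operator $\mathbf{A}^{\mathbb{C}}$, but only those of them that are real are
eigenvalues of the operator $\mathbf{A}$.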





If the operator $\mathbf{A}$ is nilpotent, then the operator $\mathbf{A}^{\mathbb{C}}$ is
also nilpotent (and has the same degree of nilpotency) and
therefore all its eigenvalues are equal to zero.

Since these eigenvalues exhaust all the roots of the polynomial
$f_{\mathbf{A}^{\mathbb{C}}} = f_{\mathbf{A}}$, this proves that $f_{\mathbf{A}}(\lambda) = (-\lambda)^n$ for
any nilpotent operator $\mathbf{A}$. In terms of matrices this means
(we replace $\lambda$ by $-\lambda$) that for any nilpotent matrix $A$ we
have the identity

$$\det(A + \lambda E) = \lambda^n.$$ □

It is apparently very difficult to prove this “purely matrix”
statement by a straightforward calculation of the determinant.

By virtue of the Hamilton-Cayley theorem it follows that
in an $n$-dimensional space the degree of nilpotency of an arbitrary
nilpotent operator does not exceed $n$. □
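For a concrete nilpotent matrix both statements can at least be tested numerically. The following sketch uses a hypothetical $3 \times 3$ integer matrix whose cube is zero; `det3` is a plain cofactor expansion and all names are assumptions made for illustration.

```python
# Sketch: det(A + lam E) = lam^n for one nilpotent 3x3 matrix.

def mat_mul(X, Y):
    """Product of two square matrices given as lists of rows."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def det3(M):
    """Determinant of a 3x3 matrix by cofactor expansion along the first row."""
    return (M[0][0] * (M[1][1] * M[2][2] - M[1][2] * M[2][1])
            - M[0][1] * (M[1][0] * M[2][2] - M[1][2] * M[2][0])
            + M[0][2] * (M[1][0] * M[2][1] - M[1][1] * M[2][0]))

A = [[2, -1, 0],
     [4, -2, 0],
     [1,  1, 0]]               # hypothetical nilpotent matrix
n = 3
zero = [[0] * n for _ in range(n)]

A2 = mat_mul(A, A)
A3 = mat_mul(A2, A)            # A^3 = 0 while A^2 != 0

def shifted_det(lam):
    """det(A + lam E)."""
    return det3([[A[i][j] + (lam if i == j else 0) for j in range(n)]
                 for i in range(n)])

checks = [shifted_det(lam) == lam ** n for lam in range(-3, 4)]
print(A2 != zero, A3 == zero, all(checks))
```

Here the degree of nilpotency is exactly $n = 3$, so the bound in the last statement cannot be improved in general.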

These beautiful statements show what a powerful tool for
proving theorems the seemingly quite trivial method
of complexification is.

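Now we shall apply it to the investigation of characteristic
roots.

It is obvious that for any subspace $\mathcal{E}$ of the space $\mathcal{V}^{\mathbb{C}}$
the subset $\operatorname{Re} \mathcal{E}$ of all vectors of $\mathcal{V}$ of the form $\operatorname{Re} z$, where
$z \in \mathcal{E}$, or equivalently (since $\operatorname{Im} z = \operatorname{Re}(-iz)$) of the form
$\operatorname{Im} z$, where $z \in \mathcal{E}$, is a subspace of the space $\mathcal{V}$ (if
$x_1 = \operatorname{Re} z_1$, $x_2 = \operatorname{Re} z_2$, then $x_1 + x_2 = \operatorname{Re}(z_1 + z_2)$ and
$k \operatorname{Re} z = \operatorname{Re} kz$ for any $k \in \mathbb{R}$).

Similarly, for any subspace $\mathcal{P}$ of the space $\mathcal{V}$ the set $\mathcal{P}^{\mathbb{C}}$
of all vectors of the form $x + iy$, where $x, y \in \mathcal{P}$, is
a subspace of the space $\mathcal{V}^{\mathbb{C}}$ (it is none other than the span
of the subspace $\mathcal{P}$ in the space $\mathcal{V}^{\mathbb{C}}$). It is clear that

$$\operatorname{Re} \mathcal{P}^{\mathbb{C}} = \mathcal{P}$$

for any subspace $\mathcal{P} \subset \mathcal{V}$.

Note that any basis $e_1, \dots, e_p$ of a subspace $\mathcal{P}$ (over $\mathbb{R}$)
is a basis of the subspace $\mathcal{P}^{\mathbb{C}}$ (over $\mathbb{C}$) as well.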

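Now let $\mathbf{A}\colon \mathcal{V} \to \mathcal{V}$ be an arbitrary linear operator on $\mathcal{V}$
and $\mathbf{A}^{\mathbb{C}}\colon \mathcal{V}^{\mathbb{C}} \to \mathcal{V}^{\mathbb{C}}$ its complexification. Consider an
arbitrary characteristic root $\lambda$ of the operator $\mathbf{A}$. It is an
eigenvalue of the operator $\mathbf{A}^{\mathbb{C}}$ and therefore the corresponding
proper subspace $\mathcal{E}_\lambda$ is defined in $\mathcal{V}^{\mathbb{C}}$.

Suppose first that the root $\lambda$ is real. Then it is an eigenvalue
of the operator $\mathbf{A}$ and the corresponding proper subspace
$\mathcal{P}_\lambda$ is defined in the space $\mathcal{V}$.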

If we are given some system of $n$ homogeneous linear
equations in $n$ unknowns whose coefficients are real and
constitute a matrix of rank $r$, then the solutions of the
system form in $\mathbb{R}^n$ a subspace $\mathcal{P}$ of dimension $n - r$,
so that each solution is a linear combination of some $n - r$
linearly independent solutions constituting a basis of that
subspace. As already said, it is customary to call this basis
a fundamental system of solutions.

The same system of equations may be regarded as a system
with complex coefficients and its solutions sought in
$\mathbb{C}^n = (\mathbb{R}^n)^{\mathbb{C}}$. Then every fundamental system of solutions
remains a fundamental system of solutions but, to obtain
all solutions, one has to take linear combinations of the
solutions of this system not with real but with arbitrary complex
coefficients. In terms of the notation introduced above
this means that $\mathcal{P}^{\mathbb{C}}$ is the subspace of solutions of the given
system of equations in the space $\mathbb{C}^n$.

These general considerations apply in particular to the
subspace $\mathcal{P}_\lambda$, the coordinates of whose vectors in an arbitrary
basis $e_1, \dots, e_n$ of the space $\mathcal{V}$ satisfy a system of
homogeneous linear equations with the real coefficient matrix
$A - \lambda E$. As we know, the same vectors $e_1, \dots, e_n$
constitute a basis of the space $\mathcal{V}^{\mathbb{C}}$ and in that basis the
coordinates of the vectors of $\mathcal{E}_\lambda$ are defined by the same system
of equations. This proves that for every real characteristic
root $\lambda$ of the operator $\mathbf{A}$ we have

$$\mathcal{E}_\lambda = \mathcal{P}_\lambda^{\mathbb{C}}$$

and therefore also

(2)  $\mathcal{P}_\lambda = \operatorname{Re} \mathcal{E}_\lambda.$

Now let $\lambda$ be nonreal. In that case we define a subspace
$\mathcal{P}_\lambda \subset \mathcal{V}$ by (2).



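Thus the subspace $\mathcal{P}_\lambda$ is defined for any characteristic
root $\lambda$, this notation in the case of real $\lambda$
having the former meaning.

We shall call $\mathcal{P}_\lambda$ the proper subspace
of the operator $\mathbf{A}$ belonging to the characteristic root $\lambda$. It
should be remembered that its vectors are eigenvectors of
the operator $\mathbf{A}$ only when $\lambda$ is real.

It is clear that each of the spaces $\mathcal{P}_\lambda$ is invariant under $\mathbf{A}$.

With $\lambda$ real, $\mathcal{P}_\lambda^{\mathbb{C}} = \mathcal{E}_\lambda$. What is $\mathcal{P}_\lambda^{\mathbb{C}}$ equal to when $\lambda$
is nonreal?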

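To answer the question we remark that since the coefficients
of the characteristic polynomial $f_{\mathbf{A}}$ are real, besides
$\lambda$ its roots include the complex conjugate number $\bar\lambda$. The
coordinates of the vectors in the corresponding proper subspace
$\mathcal{E}_{\bar\lambda}$ of the operator $\mathbf{A}^{\mathbb{C}}$ are solutions of a system of linear
equations with the complex conjugate matrix $A - \bar\lambda E$ and
hence are obtained from the coordinates of the vectors of the
subspace $\mathcal{E}_\lambda$ by changing to complex conjugate numbers.
In coordinate-free terms this means that if $z = x + iy \in \mathcal{E}_\lambda$,
then $\bar z = x - iy \in \mathcal{E}_{\bar\lambda}$. We can write this fact as

$$\mathcal{E}_{\bar\lambda} = \overline{\mathcal{E}_\lambda},$$

a convention but a clear one.

It follows that $\operatorname{Re} \mathcal{E}_{\bar\lambda} = \operatorname{Re} \mathcal{E}_\lambda$, i.e. that
$\mathcal{P}_{\bar\lambda} = \mathcal{P}_\lambda$.

On the other hand, since $\lambda \ne \bar\lambda$, the sum of the subspaces
$\mathcal{E}_\lambda$ and $\mathcal{E}_{\bar\lambda}$ is (Proposition 1 of Lecture 14) their direct
sum $\mathcal{E}_\lambda \oplus \mathcal{E}_{\bar\lambda}$. If $e_1, \dots, e_q$ is a basis of the subspace
$\mathcal{E}_\lambda$, where $q = \dim \mathcal{E}_\lambda$, then the vectors $e_1, \dots, e_q, \bar e_1, \dots, \bar e_q$
obviously constitute a basis of the space
$\mathcal{E}_\lambda \oplus \mathcal{E}_{\bar\lambda}$. But then so do the vectors

$$\operatorname{Re} e_1 = \frac{e_1 + \bar e_1}{2}, \quad \operatorname{Im} e_1 = \frac{e_1 - \bar e_1}{2i}, \quad \dots, \quad \operatorname{Re} e_q = \frac{e_q + \bar e_q}{2}, \quad \operatorname{Im} e_q = \frac{e_q - \bar e_q}{2i}.$$

The vectors $\operatorname{Re} e_1, \operatorname{Im} e_1, \dots, \operatorname{Re} e_q, \operatorname{Im} e_q$ are real
by construction, i.e. lie in $\mathcal{V}$. If $\mathcal{P} \subset \mathcal{V}$ is the $2q$-dimensional
subspace of the space $\mathcal{V}$ generated by these vectors, then
by definition

$$\mathcal{P}^{\mathbb{C}} = \mathcal{E}_\lambda \oplus \mathcal{E}_{\bar\lambda}.$$

But it is easy to see that under the correspondence
$\mathcal{E} \mapsto \operatorname{Re} \mathcal{E}$ a sum of subspaces becomes a sum, i.e.

$$\operatorname{Re}(\mathcal{E}_1 + \mathcal{E}_2) = \operatorname{Re} \mathcal{E}_1 + \operatorname{Re} \mathcal{E}_2$$

for any subspaces $\mathcal{E}_1, \mathcal{E}_2 \subset \mathcal{V}^{\mathbb{C}}$. Therefore

$$\mathcal{P} = \operatorname{Re} \mathcal{P}^{\mathbb{C}} = \operatorname{Re}(\mathcal{E}_\lambda \oplus \mathcal{E}_{\bar\lambda}) = \operatorname{Re} \mathcal{E}_\lambda + \operatorname{Re} \mathcal{E}_{\bar\lambda} = \mathcal{P}_\lambda.$$

This proves that for any nonreal characteristic root $\lambda$ of
the operator $\mathbf{A}$ the equation

$$\mathcal{P}_\lambda^{\mathbb{C}} = \mathcal{E}_\lambda \oplus \mathcal{E}_{\bar\lambda}$$

is valid. □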

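The example of the subspaces $\mathcal{E}_1 = \mathcal{E}_\lambda$ and $\mathcal{E}_2 = \mathcal{E}_{\bar\lambda}$
shows that under the correspondence $\mathcal{E} \mapsto \operatorname{Re} \mathcal{E}$ a direct
sum does not necessarily become a direct sum. It is easy
to see, however, that if $\mathcal{E}_1 = \mathcal{P}_1^{\mathbb{C}}$ and $\mathcal{E}_2 = \mathcal{P}_2^{\mathbb{C}}$, then

(3)  $\operatorname{Re}(\mathcal{E}_1 \oplus \mathcal{E}_2) = \mathcal{P}_1 \oplus \mathcal{P}_2.$

Indeed, it is clear that

$$(\mathcal{P}_1 \oplus \mathcal{P}_2)^{\mathbb{C}} = \mathcal{P}_1^{\mathbb{C}} \oplus \mathcal{P}_2^{\mathbb{C}}.$$

Therefore, by applying $\operatorname{Re}$ we obtain (3). □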

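We now prove for the spaces $\mathcal{P}_\lambda$ the analogue of Proposition 1
of Lecture 14.

Proposition 1. Let $\lambda_1, \dots, \lambda_m$ be characteristic roots of
an operator $\mathbf{A}$ (whether real or not) such that

$$\lambda_i \ne \lambda_j \quad\text{and}\quad \lambda_i \ne \bar\lambda_j, \qquad i, j = 1, \dots, m,$$

with $i \ne j$. Then the sum $\mathcal{P}$ of the subspaces $\mathcal{P}_{\lambda_1}, \dots, \mathcal{P}_{\lambda_m}$
is their direct sum:

(4)  $\mathcal{P} = \mathcal{P}_{\lambda_1} \oplus \dots \oplus \mathcal{P}_{\lambda_m}.$

The example of the subspaces $\mathcal{P}_\lambda$ and $\mathcal{P}_{\bar\lambda}$ shows that
the condition $\lambda_i \ne \bar\lambda_j$ is essential here.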

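Proof. Let $\lambda_1, \dots, \lambda_r$ be all the given real roots and
$\lambda_{r+1}, \dots, \lambda_m$ all the nonreal ones. Then the $2m - r$
eigenvalues

$$\lambda_1, \ \dots, \ \lambda_r, \ \lambda_{r+1}, \ \bar\lambda_{r+1}, \ \dots, \ \lambda_m, \ \bar\lambda_m$$

of the operator $\mathbf{A}^{\mathbb{C}}$ are all distinct and therefore the sum
$\mathcal{E}$ of the corresponding proper subspaces is their direct
sum:

$$\mathcal{E} = \mathcal{E}_{\lambda_1} \oplus \dots \oplus \mathcal{E}_{\lambda_r} \oplus (\mathcal{E}_{\lambda_{r+1}} \oplus \mathcal{E}_{\bar\lambda_{r+1}}) \oplus \dots \oplus (\mathcal{E}_{\lambda_m} \oplus \mathcal{E}_{\bar\lambda_m}).$$

Applying $\operatorname{Re}$ and taking into account the fact that

$$\mathcal{E}_{\lambda_1} = \mathcal{P}_{\lambda_1}^{\mathbb{C}}, \ \dots, \ \mathcal{E}_{\lambda_r} = \mathcal{P}_{\lambda_r}^{\mathbb{C}}, \qquad \mathcal{E}_{\lambda_{r+1}} \oplus \mathcal{E}_{\bar\lambda_{r+1}} = \mathcal{P}_{\lambda_{r+1}}^{\mathbb{C}}, \ \dots, \ \mathcal{E}_{\lambda_m} \oplus \mathcal{E}_{\bar\lambda_m} = \mathcal{P}_{\lambda_m}^{\mathbb{C}},$$

we obtain immediately equation (4). □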

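In the case where the operator $\mathbf{A}^{\mathbb{C}}$ is diagonalizable it
follows from Proposition 1 that the space $\mathcal{V}$ is decomposable
as a direct sum of invariant (under $\mathbf{A}$) subspaces $\mathcal{P}_\lambda$, where $\lambda$
runs over all the real roots of the polynomial $f_{\mathbf{A}}$ and all
of its mutually nonconjugate nonreal roots.

The restriction $\mathbf{A}_\lambda = \mathbf{A}|_{\mathcal{P}_\lambda}$ of the operator $\mathbf{A}$ to a subspace
$\mathcal{P}_\lambda$ with $\lambda$ real is known: it is the scalar operator $\lambda\mathbf{E}$,
which has the diagonal (scalar) matrix $\lambda E$ in an arbitrary
basis.

Consider now the operator $\mathbf{A}_\lambda = \mathbf{A}|_{\mathcal{P}_\lambda}$ with $\lambda$ nonreal.
Let

$$\lambda = \alpha + i\beta, \quad\text{where } \alpha, \beta \in \mathbb{R} \text{ and } \beta \ne 0.$$

It was shown above that for any basis $e_1, \dots, e_q$ of the
subspace $\mathcal{E}_\lambda$ the vectors

(5)  $\operatorname{Re} e_1, \ \operatorname{Im} e_1, \ \dots, \ \operatorname{Re} e_q, \ \operatorname{Im} e_q$

constitute a basis of the space $\mathcal{P}_\lambda$. Set, to simplify notation,
$e = e_1$, $x = \operatorname{Re} e$, $y = \operatorname{Im} e$ and consider the two-dimensional
subspace $\mathcal{P} \subset \mathcal{P}_\lambda$ with the basis $x, y$.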







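Since $e \in \mathcal{E}_\lambda$, we have $\mathbf{A}^{\mathbb{C}} e = \lambda e$, i.e.

$$\mathbf{A}^{\mathbb{C}}(x + iy) = (\alpha + i\beta)(x + iy).$$

This means by definition that

$$\mathbf{A}x + i\mathbf{A}y = (\alpha x - \beta y) + i(\beta x + \alpha y),$$

i.e. that

$$\mathbf{A}x = \alpha x - \beta y, \qquad \mathbf{A}y = \beta x + \alpha y.$$

Thus we see that the subspace $\mathcal{P}$ is invariant under the operator
$\mathbf{A}$ and that the restriction of the operator $\mathbf{A}$ to $\mathcal{P}$ has the matrix

(6)  $\begin{pmatrix} \alpha & \beta \\ -\beta & \alpha \end{pmatrix}$

in the basis $x, y$. □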
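The two real relations just obtained can be checked on a small example. In the sketch below the matrix `A`, its nonreal characteristic root $2 + i$ and the eigenvector `e` of the complexification are hypothetical sample data worked out by hand; the helper `mat_vec` is likewise an assumption.

```python
# Sketch: real and imaginary parts of a complex eigenvector span an invariant plane.

def mat_vec(M, v):
    """Multiply a square matrix (list of rows) by a vector."""
    return [sum(M[i][j] * v[j] for j in range(len(v))) for i in range(len(M))]

A = [[1, -2],
     [1,  3]]                  # characteristic polynomial t^2 - 4t + 5, roots 2 +/- i
alpha, beta = 2, 1             # lam = alpha + i*beta
lam = complex(alpha, beta)

e = [complex(-1, 1), complex(1, 0)]   # eigenvector of A^C for lam
x = [-1, 1]                           # x = Re e
y = [1, 0]                            # y = Im e

Ae = mat_vec(A, e)
lam_e = [lam * c for c in e]          # should equal Ae

Ax = mat_vec(A, x)
Ay = mat_vec(A, y)
rhs_x = [alpha * xi - beta * yi for xi, yi in zip(x, y)]   # alpha*x - beta*y
rhs_y = [beta * xi + alpha * yi for xi, yi in zip(x, y)]   # beta*x + alpha*y

print(Ae == lam_e, Ax == rhs_x, Ay == rhs_y)
```

Writing the coordinates of $\mathbf{A}x$ and $\mathbf{A}y$ with respect to $x, y$ as columns gives exactly the matrix (6) with $\alpha = 2$, $\beta = 1$.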

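Since the space $\mathcal{P}_\lambda$ is a direct sum of $q$ subspaces of the
form $\mathcal{P}$, we see that in the basis (5) the matrix of the operator
$\mathbf{A}_\lambda$ is a block diagonal matrix

$$
\begin{pmatrix}
\begin{matrix} \alpha & \beta \\ -\beta & \alpha \end{matrix} & & 0 \\
& \ddots & \\
0 & & \begin{matrix} \alpha & \beta \\ -\beta & \alpha \end{matrix}
\end{pmatrix}
$$

with $q = \dim \mathcal{E}_\lambda$ matrices of the form (6) on the diagonal.

Comparing all that has been proved we see that
the following theorem is true.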

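Theorem 1. For a linear operator $\mathbf{A}\colon \mathcal{V} \to \mathcal{V}$ over the
field $\mathbb{R}$, let the operator $\mathbf{A}^{\mathbb{C}}\colon \mathcal{V}^{\mathbb{C}} \to \mathcal{V}^{\mathbb{C}}$ be diagonalizable
(this is in particular the case if the operator $\mathbf{A}$ has a simple
spectrum). Also let $\lambda_1, \dots, \lambda_r$ be all the real and
$\lambda_{r+1} = \alpha_1 + i\beta_1, \dots, \lambda_m = \alpha_{m-r} + i\beta_{m-r}$ be all the nonreal,
mutually nonconjugate characteristic roots of the operator $\mathbf{A}$,
each of which is repeated as many times as is its
multiplicity (so that $n = 2m - r$).

Then the space $\mathcal{V}$ has a basis in which the matrix of the
operator $\mathbf{A}$ is a direct sum of a certain number of first order
matrices $\lambda_1, \dots, \lambda_r$ and second order matrices

$$\begin{pmatrix} \alpha_1 & \beta_1 \\ -\beta_1 & \alpha_1 \end{pmatrix}, \ \dots, \ \begin{pmatrix} \alpha_{m-r} & \beta_{m-r} \\ -\beta_{m-r} & \alpha_{m-r} \end{pmatrix},$$

i.e. is of the form

(7)  $\begin{pmatrix}
\lambda_1 & & & & & \\
& \ddots & & & 0 & \\
& & \lambda_r & & & \\
& & & \begin{matrix} \alpha_1 & \beta_1 \\ -\beta_1 & \alpha_1 \end{matrix} & & \\
& 0 & & & \ddots & \\
& & & & & \begin{matrix} \alpha_{m-r} & \beta_{m-r} \\ -\beta_{m-r} & \alpha_{m-r} \end{matrix}
\end{pmatrix}.$ □

Of course, a similar theorem, but with a more complicated
matrix (7), holds also when the operator $\mathbf{A}^{\mathbb{C}}$ is nondiagonalizable.
We shall not need the theorem and therefore we
shall neither prove it nor state it.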




Lecture 18 


Euclidean and unitary spaces • Orthogonal complements • The
identification of vectors and covectors • Annulets and orthogonal
complements • Bilinear functionals and linear operators • Elimination
of arbitrariness in the identification of tensors of different
types • The metric tensor • Lowering and raising of indices


According to Definitions 2 and 3 of Lecture 13 in [1] a vector
space $\mathcal{V}$ over the field $\mathbb{R}$ is said to be Euclidean if some
positive definite symmetric bilinear functional is given in it.
The functional is called a scalar multiplication and its
value on vectors $x$ and $y$, the scalar product of the vectors,
is designated by $xy$ or $(x, y)$.

A direct transfer of these concepts to the case of the ground 
field C is impossible, since there is no notion of positivity 
in C. One has to proceed in a more intricate way. 

Definition 1. A functional $x, y \mapsto B(x, y)$ given in a complex
vector space $\mathcal{V}$ is said to be sesquilinear if it is linear
in the first argument, i.e.

$$B(x_1 + x_2, y) = B(x_1, y) + B(x_2, y)$$

and

$$B(cx, y) = cB(x, y)$$

for any vectors $x_1, x_2, x, y \in \mathcal{V}$ and any number $c \in \mathbb{C}$,
and semilinear in the second argument, i.e.

$$B(x, y_1 + y_2) = B(x, y_1) + B(x, y_2)$$

and

$$B(x, cy) = \bar c\,B(x, y)$$

for any vectors $x, y_1, y_2, y \in \mathcal{V}$ and any number $c \in \mathbb{C}$.



A sesquilinear functional $B$ is said to be Hermitian if

$$B(y, x) = \overline{B(x, y)}$$

for any vectors $x, y \in \mathcal{V}$.

For a Hermitian functional, the number $B(x, x)$ is real
whatever the vector $x \in \mathcal{V}$. Therefore the question of its
sign is meaningful.

A Hermitian functional $B$ is said to be positive definite if

$$B(x, x) > 0$$

for any nonzero vector $x$ of the space $\mathcal{V}$.

A vector space $\mathcal{V}$ over the field $\mathbb{C}$ is said to be unitary
(or Hermitian) if some positive definite Hermitian
sesquilinear functional is given in it. The functional is
called a scalar multiplication and its value on vectors $x$
and $y$, the scalar product of the vectors, is designated by
$xy$ or $(x, y)$.

An example of a unitary space is the space $\mathbb{C}^n$ with the scalar
product

$$(x, y) = x_1 \bar y_1 + \dots + x_n \bar y_n.$$

At its early stages the theory of unitary spaces closely
resembles that of Euclidean spaces (see Lectures 13 and 14
in [1]). Thus, for example, in a unitary space the length
$|x| = \sqrt{(x, x)}$ of any vector $x$ is defined, the Cauchy-Buniakowski
inequality holds, the concepts of orthogonal
vectors ($(x, y) = 0$) and of orthonormal families of vectors
and, in particular, of orthonormal bases make sense, the
Bessel inequality holds (Proposition 2 of Lecture 14 in [1];
but one, naturally, has to write $|x_i|^2$ instead of $x_i^2$), the
analogue of Proposition 3 of Lecture 14 in [1] on the properties
of orthonormal bases holds (but, say, Parseval's
formula is now of the form $(x, y) = x_1 \bar y_1 + \dots + x_n \bar y_n$),
the Gram-Schmidt orthogonalization process is applicable,
and so on. Of course, identically formulated theorems have
as a rule different geometrical meanings for Euclidean and
unitary spaces. For example, for Euclidean spaces the fact
that there exists an orthonormal basis means that any
$n$-dimensional Euclidean space is isomorphic to the space $\mathbb{R}^n$
with the multiplication $(x, y) = x_1 y_1 + \dots + x_n y_n$,
while for unitary spaces it means that any $n$-dimensional
unitary space is isomorphic to the space $\mathbb{C}^n$ with the
multiplication $(x, y) = x_1 \bar y_1 + \dots + x_n \bar y_n$.

In what follows we shall prove theorems for Euclidean
and unitary spaces simultaneously whenever possible. In
contrast to [1] we shall now prefer to designate a scalar
product by the symbol $(x, y)$.
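The defining properties of the scalar product in $\mathbb{C}^n$ can be observed on concrete vectors. This is only an illustrative sketch: the function name `dot` and the sample vectors are hypothetical, and Python's built-in `complex` type stands in for the field $\mathbb{C}$.

```python
# Sketch: the standard scalar product in C^n, linear in x and semilinear in y.

def dot(x, y):
    """(x, y) = x_1 * conj(y_1) + ... + x_n * conj(y_n)."""
    return sum(a * b.conjugate() for a, b in zip(x, y))

x = [1 + 2j, 3j]
y = [2 - 1j, 1 + 1j]
c = 2 + 3j

xy = dot(x, y)
yx = dot(y, x)                         # Hermitian symmetry: the conjugate of (x, y)
xx = dot(x, x)                         # real and positive for x != 0
x_cy = dot(x, [c * b for b in y])      # semilinearity: conj(c) * (x, y)
cx_y = dot([c * a for a in x], y)      # linearity: c * (x, y)

print(xy, yx, xx, x_cy, cx_y)
```

The small integer entries are chosen so that the floating-point arithmetic of `complex` is exact here; for general data the comparisons would need a tolerance.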

Let $S$ be an arbitrary subset of a Euclidean or unitary
space $\mathcal{V}$.

Definition 2. The orthogonal complement $S^\perp$ of a subset
$S$ is the set of all vectors of $\mathcal{V}$ orthogonal to each vector of $S$:

$$S^\perp = \{y \in \mathcal{V};\ (x, y) = 0,\ x \in S\}.$$

The properties of orthogonal complements are similar
to those of annulets (see Lecture 4). It is clear, for example,
that the orthogonal complement of any set is a subspace and
that $S^\perp \supset T^\perp$ if $S \subset T$. The analogue of Proposition 3 of
Lecture 4 also holds:

Proposition 1. The orthogonal complement of an arbitrary
set $S$ coincides with the orthogonal complement of its linear
span:

$$S^\perp = [S]^\perp.$$

Proof. Since $S \subset [S]$, we have $S^\perp \supset [S]^\perp$. Conversely, if
$y \in S^\perp$ and $x = k_1 x_1 + \dots + k_m x_m$, where $x_1, \dots, x_m \in S$,
then

$$(x, y) = k_1 (x_1, y) + \dots + k_m (x_m, y) = 0$$

and hence $y \in [S]^\perp$. □

Therefore we may consider without loss of generality
only the orthogonal complements of subspaces.

The analogue of Proposition 4 of Lecture 4, on the dimension
of an annulet, also holds for orthogonal complements.
For these, however, a stronger statement is true, which is
possible because for every subspace $\mathcal{P} \subset \mathcal{V}$ its orthogonal
complement $\mathcal{P}^\perp$ is a subspace of the same space $\mathcal{V}$.

Proposition 2. For any subspace $\mathcal{P} \subset \mathcal{V}$ the space $\mathcal{V}$ is
the direct sum of the subspace $\mathcal{P}$ and its orthogonal complement:

$$\mathcal{V} = \mathcal{P} \oplus \mathcal{P}^\perp.$$

Proof. Let $e_1, \dots, e_p$ be an orthonormal basis of the subspace
$\mathcal{P}$ and let $x \in \mathcal{V}$. Denote the Fourier coefficients of
the vector $x$ with respect to the orthonormal family of
vectors $e_1, \dots, e_p$ by $x_1, \dots, x_p$ and compose the vector
$x' = x_1 e_1 + \dots + x_p e_p$. Then, according to Proposition 2
of Lecture 14 in [1], the vector $x - x'$ will be orthogonal
to all the vectors $e_1, \dots, e_p$ and therefore (Proposition 1)
lie in the subspace $\mathcal{P}^\perp$. Thus

$$x = x' + (x - x'),$$

where $x' \in \mathcal{P}$ and $x - x' \in \mathcal{P}^\perp$. This proves that
$\mathcal{V} = \mathcal{P} + \mathcal{P}^\perp$.

It remains to prove that $\mathcal{P} \cap \mathcal{P}^\perp = 0$. But this is obvious,
since if $x \in \mathcal{P} \cap \mathcal{P}^\perp$, and hence $(x, y) = 0$ for any
$y \in \mathcal{P}$, then in particular $(x, x) = 0$ and consequently
$x = 0$. □
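The decomposition $x = x' + (x - x')$ from the proof can be carried out explicitly in $\mathbb{R}^3$. The sketch below assumes the standard scalar product and a hypothetical orthonormal pair `e1`, `e2` spanning the subspace; exact rational arithmetic is used so that orthogonality holds exactly.

```python
from fractions import Fraction as F

def dot(u, v):
    """Standard scalar product in R^n."""
    return sum(a * b for a, b in zip(u, v))

# Hypothetical orthonormal basis of a two-dimensional subspace P of R^3.
e1 = [F(1), F(0), F(0)]
e2 = [F(0), F(3, 5), F(4, 5)]
basis = (e1, e2)

x = [F(2), F(5), F(10)]

coeffs = [dot(x, e) for e in basis]                      # Fourier coefficients x_i = (x, e_i)
x_par = [sum(c * e[i] for c, e in zip(coeffs, basis))    # x' = x_1 e_1 + x_2 e_2, lies in P
         for i in range(3)]
x_perp = [a - b for a, b in zip(x, x_par)]               # x - x', lies in the orthogonal complement

print(coeffs, x_par, x_perp)
print([dot(x_perp, e) for e in basis])                   # both scalar products vanish
```

The component `x_perp` is orthogonal to both basis vectors and hence, by Proposition 1, to the whole subspace.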

It should not be surprising now that the analogue of
Proposition 5 of Lecture 4 also holds.

Proposition 3. For any subspace $\mathcal{P} \subset \mathcal{V}$ we have

$$\mathcal{P}^{\perp\perp} = \mathcal{P}.$$

Proof. If $x \in \mathcal{P}$, then $(x, y) = 0$ for any $y \in \mathcal{P}^\perp$ and
therefore $(y, x) = 0$; this precisely means that $x \in \mathcal{P}^{\perp\perp}$.
Consequently, $\mathcal{P} \subset \mathcal{P}^{\perp\perp}$. Conversely, let $x \in \mathcal{P}^{\perp\perp}$. Using
Proposition 2 set $x = x' + x''$, where $x' \in \mathcal{P}$, $x'' \in \mathcal{P}^\perp$ and
therefore $(x', x'') = 0$. Since $x \in \mathcal{P}^{\perp\perp}$, we have $(x, x'') = 0$
and hence

$$(x'', x'') = (x - x', x'') = (x, x'') - (x', x'') = 0.$$

Consequently, $x'' = 0$ and therefore $x = x' \in \mathcal{P}$. Thus
$\mathcal{P}^{\perp\perp} = \mathcal{P}$. □

We stress that all this holds for both Euclidean and unitary
spaces.

This parallelism between Euclidean and unitary spaces
is violated for conjugate spaces. Therefore we have to consider
the conjugate space $\mathcal{V}'$ separately for a Euclidean and
a unitary space $\mathcal{V}$.

First let $\mathcal{V}$ be a Euclidean space.










Proposition 4. For any Euclidean space $\mathcal{V}$, there is a natural
isomorphism

$$\mathcal{V} \to \mathcal{V}'.$$

Proof. According to Proposition 2 of Lecture 4 it suffices
to prove that the space $\mathcal{V}$ is self-dual:

$$\mathcal{V} \mid \mathcal{V},$$

i.e. that there exists a natural pairing of $\mathcal{V}$ and $\mathcal{V}$. But
such a pairing does in fact exist: it is the scalar multiplication
(obviously nonsingular, by virtue of positivity). □

The isomorphism $\mathcal{V} \to \mathcal{V}'$ is explicitly defined as the mapping
associating with every vector $y \in \mathcal{V}$ the linear functional

$$\xi_y\colon x \mapsto (x, y).$$

It is clear that the correspondence $y \mapsto \xi_y$ is a homomorphism.
Since the spaces $\mathcal{V}$ and $\mathcal{V}'$ have the same dimension,
to prove that this homomorphism is an isomorphism, it is
sufficient to establish that its kernel is zero, i.e. that if
$y \ne 0$, then $\xi_y \ne 0$. But this is obvious, since, say,
$\xi_y(y) = (y, y) \ne 0$.

Here we have in fact repeated the proof of Proposition 2
of Lecture 5 for the case of the pairing $x, y \mapsto (x, y)$.

On the face of it, it seems that the proof remains valid
for the case of unitary spaces too. But a closer look shows
that for a unitary space $\mathcal{V}$ the mapping $y \mapsto \xi_y$ is not a
homomorphism. That is, although it does obviously carry a sum
over into a sum,

$$\xi_{y_1 + y_2} = \xi_{y_1} + \xi_{y_2}, \qquad y_1, y_2 \in \mathcal{V},$$

it carries a product by a number over into a product by
the complex conjugate number, i.e. for any vector $y \in \mathcal{V}$
and any number $c \in \mathbb{C}$ we have

$$\xi_{cy} = \bar c\,\xi_y.$$

Mappings of vector spaces over the field $\mathbb{C}$ possessing these
properties are called semilinear. It is easy to see that, just
like a linear mapping, a semilinear mapping of vector spaces
of the same dimension having a zero kernel is bijective
(it is said to be a semilinear isomorphism). All the arguments
in the proof of Proposition 4 thus remain fully valid and
hence the space conjugate to a unitary space $\mathcal{V}$ is semilinearly
isomorphic to it.

This of course suits us little, since the primary importance
of Proposition 4 is that it allows identification of
every Euclidean space $\mathcal{V}$ with its conjugate space $\mathcal{V}'$ (without,
in particular, distinguishing, even in notation, between
the vector $y$ and the covector $\xi_y$), while the presence of only
a semilinear isomorphism in the unitary case permits such
identification only with reservations.

This can be remedied by understanding the covectors of
$\mathcal{V}'$ to be not linear but semilinear functionals $\xi\colon \mathcal{V} \to \mathbb{C}$,
i.e. mappings such that

$$\xi(x + y) = \xi(x) + \xi(y)$$

and

$$\xi(cx) = \bar c\,\xi(x)$$

for any vectors $x, y \in \mathcal{V}$ and any number $c \in \mathbb{C}$. This
seems to be the trend, but at present this substitution is
not at all generally accepted.

Alternatively, we may define a new operation of multiplication
$c, \xi \mapsto c\xi$ of functionals $\xi$ by numbers $c \in \mathbb{C}$ in the
space $\mathcal{V}'$, the linear functionals $\xi\colon \mathcal{V} \to \mathbb{C}$ assumed as
before to be its vectors, putting

$$(c\xi)(x) = \xi(\bar c x) = \bar c\,(\xi(x))$$

for any functional $\xi \in \mathcal{V}'$, any number $c \in \mathbb{C}$, and any vector
$x \in \mathcal{V}$.

Of course, thus we simply transfer the “parasite” complex
conjugation to other, possibly less conspicuous, places.
Since both variants have their advantages and drawbacks and
neither has yet become prevalent, each requiring a revision
of all the previous material (say, of tensor theory), we shall
give up both, stick to the former point of view, and shall
not aim at formal perfection.

As far as the identification of vectors and covectors is
concerned, we shall allow it to be made in the unitary case
too, remembering all the time the possibility of the boring
complex conjugation appearing.

The fact that in Euclidean and unitary spaces vectors
and covectors practically coincide allows identification of
objects fundamentally different in arbitrary vector spaces.

For example, it is easy to see that when covectors are
identified with vectors the annulet $S^\circ$ of an arbitrary set
$S \subset \mathcal{V}$ coincides with the orthogonal complement $S^\perp$.
Indeed, the inclusion $\xi \in S^\circ$ implies that $\xi(x) = 0$ for any
$x \in S$. Therefore, if we identify the covector $\xi$ with the vector
$y \in \mathcal{V}$ satisfying the relation $\xi(x) = (x, y)$ for any vector
$x \in \mathcal{V}$, then, in particular, for any vector $x \in S$ we shall
have the equation $(x, y) = 0$, implying that $y \in S^\perp$. □

This explains the above parallelism between annulets and
orthogonal complements.

The coincidence of vectors and covectors in Euclidean
space leads to the most pronounced simplifications in tensor
theory, allowing identification of different $(p, q)$-tensors
with the same sum $p + q$, since it is possible to declare
each argument of a tensor to be a vector or covector at will.

Consider, for example, a $(2, 0)$-tensor, i.e. a bilinear
functional $A\colon x, y \mapsto A(x, y)$. Assuming its second argument
$y$ to be a covector, we obtain from it a bilinear
$(1, 1)$-functional, i.e. a linear operator $\mathbf{A}\colon x \mapsto \mathbf{A}x$. It is
easy to entangle oneself in identifications here. So be attentive:
the operator $\mathbf{A}$ transforms a vector $x$ into a vector $\mathbf{A}x$
such that, if considered as a functional on covectors, it has
on a covector $\xi$ the value $\xi(\mathbf{A}x) = A(x, y)$, where $y$ is
the vector identified with the covector $\xi$. But the identification
$\xi = y$ implies that $\xi(z) = (z, y)$ for any vector $z \in \mathcal{V}$
and in particular that $\xi(\mathbf{A}x) = (\mathbf{A}x, y)$. Thus

(1)  $A(x, y) = (\mathbf{A}x, y).$

Formula (1) explicitly describes the bijective correspondence
between linear operators $\mathbf{A}\colon x \mapsto \mathbf{A}x$ and bilinear
functionals $A\colon x, y \mapsto A(x, y)$ in a Euclidean space $\mathcal{V}$.
Irrespective of the general theory it could be accepted as
a definition of that correspondence. Then it is necessary to
establish that for any linear operator $\mathbf{A}$ the functional $A$
defined by (1) is bilinear (this reduces to an automatic check),
that the resulting “operator” → “functional” correspondence
is a homomorphism of the corresponding vector spaces
(another automatic check), that this homomorphism is
injective (put $y = \mathbf{A}x$ and take advantage of the nonsingularity
of scalar multiplication) and, finally, that this
homomorphism is an isomorphism (it follows from its being
an injection, for both vector spaces have the same dimension
$n^2$).

The last approach is suitable for unitary spaces as well,
but sesquilinear functionals will obviously result instead of
bilinear functionals. In order to avoid making such reservations
we confine ourselves (in this lecture) to Euclidean
spaces; the reader can no doubt make all the changes involved in
switching to unitary spaces on his own.

An attentive reader must have already noticed that there 
is an element of arbitrariness in the identification of bilinear
functionals and linear operators described above. Indeed,
we take the second argument of a bilinear functional
A (x, y) as a covector, but we could be equally well justified 
(in Euclidean space) in assuming the first argument to be 
a covector. Then, generally speaking, a different linear 
operator A* would result for which there would hold the 
formula 

(2) A (x, y) = (x, A*y). 

The situation is similar, and still worse, for tensors
of other types. Consider, for example, a (3, 1)-tensor
$T(x_1, x_2, x_3; \xi_1)$. By declaring the vector $x_3$ to be a covector
(and denoting it by, say, $\xi_2$) we identify that tensor with
a (2, 2)-tensor $T(x_1, x_2; \xi_2, \xi_1)$. But we may assume the
new covector argument to be not the first but the second
argument, and then, in general, a different (2, 2)-tensor
will result. Moreover, assuming the vector $x_2$, rather than $x_3$,
to be a covector, we may obtain another (2, 2)-tensor
distinct from the first two. We may, for example, declare the
argument $x_1$ to be a covector and simultaneously consider
the argument $\xi_1$ to be a vector! Then a (3, 1)-tensor results,
of the same type as the original one but distinct from it,
and so on and so forth.







For definiteness we should introduce a single enumeration
(or at least a single ordering) of vector and covector
arguments and write them alternately in that order. Thus,
for example, the symbol $T(x_1, x_2, \xi_3, x_4)$ for a (3, 1)-tensor
means that when the covector $\xi_3$ is declared to be a vector,
a (4, 0)-tensor results in which the new vector argument
is the third, and when the vector $x_2$ ($x_4$) is declared to
be a covector, a (2, 2)-tensor results in which the new covector
argument is the first (the second) among the covector
arguments.

In order to avoid misunderstanding we stress that the
symbols $T(x_1, x_2, x_3, \xi_4)$ and $T(x_1, x_2, \xi_3, x_4)$ both
designate a (3, 1)-tensor with three vector arguments and one
covector argument. These tensors differ only in their origin, the
first of them having been obtained from some (4, 0)-tensor
$T(x_1, x_2, x_3, x_4)$ by giving the name of covector to the
argument $x_4$ and the second by declaring the argument $x_3$
to be a covector. Distinguishing between tensors of the
form $T(x_1, x_2, \xi_3, x_4)$ and those of the form
$T(x_1, x_2, x_3, \xi_4)$ makes no sense in arbitrary vector spaces.


Let $e_1, \dots, e_n$ be an arbitrary basis of a Euclidean
space $\mathcal{V}$. Then, by virtue of the identification $\mathcal{V}' = \mathcal{V}$,
the conjugate basis $e^1, \dots, e^n$ is also a basis of the space
$\mathcal{V}$ but, in general, one distinct from the basis $e_1, \dots, e_n$.
It is connected with the basis $e_1, \dots, e_n$ by the relations

$(e^i, e_j) = \delta^i_j, \quad i, j = 1, \dots, n.$

If

$e_i = g_{ik} e^k, \quad i = 1, \dots, n,$

are the formulas for the change from the basis $e^1, \dots, e^n$
to the basis $e_1, \dots, e_n$, then

$(e_i, e_j) = g_{ik} (e^k, e_j) = g_{ik} \delta^k_j = g_{ij}$

and we see that the numbers $g_{ij}$ are the familiar metric
coefficients of the basis $e_1, \dots, e_n$ (see Lecture 14 in [1]).
They constitute a nonsingular matrix whose inverse is a
matrix with the elements

$g^{ij} = (e^i, e^j).$

If we change to a basis

$e_{i'} = c^i_{i'} e_i,$

then the metric coefficients $g_{i'j'}$ of the new basis are
expressed by the formulas

$g_{i'j'} = c^i_{i'} c^j_{j'} g_{ij},$

i.e. are transformed by tensor law. This means that the
numbers $g_{ij}$ are the coefficients of some tensor

$G = g_{ij}\, e^i \otimes e^j = g^{ij}\, e_i \otimes e_j$

called a metric tensor of a Euclidean space $\mathcal{V}$. The value
G (x, y) of the tensor on vectors $x, y \in \mathcal{V}$ is just the scalar
product of the vectors:

$G(x, y) = g_{ij} x^i y^j = (x, y).$

Thus the term metric tensor has exactly the same meaning 
as the term scalar multiplication! 

Now let x be an arbitrary vector of the space $\mathcal{V}$. By
definition its tensor product $x \otimes G$ with a metric tensor G
is a (2, 1)-tensor. This can be contracted (see Lecture 6)
over the only superscript and over one, say for definiteness
the second, subscript (although this is of no importance in
the given case). As a result we obtain some (1, 0)-tensor,
i.e. a covector $\xi$. The value $\xi(y)$ of the covector on an
arbitrary vector y is equal to the contraction $\operatorname{tr}(\xi \otimes y)$ of
a tensor product $\xi \otimes y$ and hence to the result of the
complete contraction of the tensor $x \otimes G \otimes y$, i.e. (see the
examples of contraction in Lecture 6) to the value G (x, y) = (x, y)
of the tensor G on the vectors x and y. Since the equation
$\xi(y) = (x, y)$ means by definition that the covector $\xi$ is
identified with the vector x, this proves that the vector x,
regarded as a covector, is a contraction of the tensor $x \otimes G$
or in common parlance is a contraction of the vector x with
the tensor G. □

In a basis $e_1, \dots, e_n$ a tensor $x \otimes G$ has the coordinates
$g_{ij} x^k$ and its contraction the coordinates

$x_i = g_{ij} x^j.$

The numbers $x_1, \dots, x_n$ are called the covariant coordinates
of the vector x in the basis $e_1, \dots, e_n$. By definition they
are the coordinates of the corresponding covector $\xi$ in the
conjugate basis $e^1, \dots, e^n$ or, equivalently, the coordinates
of the vector x in the basis $e^1, \dots, e^n$ whose elements
are identified with the vectors of the space $\mathcal{V}$. The
“actual” coordinates $x^1, \dots, x^n$ of the vector x in the
basis $e_1, \dots, e_n$ are called the contravariant coordinates
of the vector, to distinguish them from the covariant
coordinates.

A change from the coordinates $x^i$ to the coordinates $x_i$
is sometimes called the lowering of the index i and the inverse
operation is called the raising of the index.
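
The lowering and raising of an index can be tried out numerically. The following Python sketch is only an illustration under stated assumptions: the basis e_1 = (1, 0), e_2 = (1, 1) of the Euclidean plane, the sample vector and all helper names are chosen here and do not come from the lectures.

```python
# Assumed example basis of the Euclidean plane, written in standard
# orthonormal coordinates: e_1 = (1, 0), e_2 = (1, 1).
basis = [(1.0, 0.0), (1.0, 1.0)]


def dot(u, v):
    """Standard scalar product of two coordinate tuples."""
    return sum(a * b for a, b in zip(u, v))


# Metric coefficients g_ij = (e_i, e_j).
g = [[dot(ei, ej) for ej in basis] for ei in basis]


def inverse_2x2(m):
    """Inverse of a nonsingular 2x2 matrix (gives the numbers g^ij)."""
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    return [[m[1][1] / det, -m[0][1] / det],
            [-m[1][0] / det, m[0][0] / det]]


g_inv = inverse_2x2(g)


def lower(x_contra):
    """Lowering the index: x_i = g_ij x^j."""
    return [sum(g[i][j] * x_contra[j] for j in range(2)) for i in range(2)]


def raise_index(x_co):
    """Raising the index: x^i = g^ij x_j."""
    return [sum(g_inv[i][j] * x_co[j] for j in range(2)) for i in range(2)]


x_contra = [2.0, 3.0]          # x = 2 e_1 + 3 e_2
x_co = lower(x_contra)         # covariant coordinates of the same vector
print(g, x_co, raise_index(x_co))
```

In this example the covariant coordinates coincide with the scalar products (x, e_i), and raising the index recovers the contravariant coordinates.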

According to a single ordering of arguments of an arbit¬ 
rary tensor (see above), the superscripts and subscripts of its 
coordinates (components) must also be ordered. Therefore, if 
there are superscripts and subscripts it is necessary to 
leave gaps above for places occupied by subscripts and con¬ 
versely gaps below for places occupied by superscripts. 
Dots are sometimes put in the gaps for clearness. 

Thus, for example, the coordinates of a tensor
$T(x_1, x_2, x_3, \xi_1)$ are designated

$T_{i_1 i_2 i_3}{}^{j_1}$

and the coordinates of a tensor $T(\xi_1, x_1, x_2, x_3)$ as

$T^{j_1}{}_{i_1 i_2 i_3}.$

In particular, there are two symbols for the coordinates
of a linear operator: $a_i{}^j$ and $a^i{}_j$, the first when the operator is
obtained from a bilinear functional with the coordinates
$a_{ij}$ by declaring the second argument to be a covector
(formula (1)) and the second by declaring the first argument
to be a covector (formula (2)). Since

$a_{ij} = A(e_i, e_j), \quad a_i{}^j = (Ae_i, e^j), \quad a^i{}_j = (e^i, A^*e_j),$

we have

(3) $a_{ij} = g_{ik}\, a^k{}_j = g_{kj}\, a_i{}^k,$

and also

(4) $a^i{}_j = g^{ik} a_{kj}, \quad a_i{}^j = g^{jk} a_{ik}.$






The numbers $a_{ij}$, $a_i{}^j$ and $a^i{}_j$ (as well as the numbers
$a^{ij} = g^{ik} g^{jl} a_{kl}$) may be regarded as different coordinates of
the same mathematical object that, just as a particle in
quantum mechanics, has two faces, a “functional” and an
“operator” one. The coordinates $a_{ij}$ are called the coordinates
covariant over both indices, the coordinates $a_i{}^j$ are called
the coordinates covariant over the first index and contravariant
over the second and so on.

As shown by (3) and (4), all these coordinates are obtained
from one another by tensor multiplication by “reciprocal”
tensors $g_{ij}$ and $g^{ij}$ followed by contractions over the
corresponding indices.

The lowering and raising of indices can be effected in
a similar manner for other tensors as well. For example,

$T_i{}^j{}_k = g^{jl}\, T_{ilk}.$

If a basis $e_1, \dots, e_n$ is orthonormal, then the conjugate
basis $e^1, \dots, e^n$ coincides with it and all formulas for
the lowering and raising of indices simply turn into the
equations of the corresponding coordinates (having the same
indices regardless of their position). For example,

(5) $a_{ij} = a_i{}^j = a^i{}_j$

for bilinear functionals and

$x_i = x^i$


for vectors. That is why even in the first semester’s lectures 
we used symbols with subscripts for the coordinates of 
vectors in an orthonormal basis. 

Note that according to the first of the formulas (5) a bi¬ 
linear functional A and the linear operator A corresponding 
to it according to (1) have the same matrix in every ortho¬ 
normal basis. 

As to the operator A* defined by (2), its matrix (in an
orthonormal basis) is the transpose of the matrix of the operator A.

In what follows we shall always identify bilinear
functionals and linear operators by (1), so we shall not need
the explicit notation $a_i{}^j$ for the elements of the matrix of
the linear operator. Therefore we shall continue to designate
these elements as $a^j_i$.








Lecture 19 


Adjoint operators • Self-adjoint operators • Skew-symmetric
and skew-Hermitian operators • Analogy between Hermitian
operators and real numbers • Spectral properties of
self-adjoint operators • The orthogonal diagonalizability of
self-adjoint operators


According to formulas (1) and (2) of the preceding lecture
we may associate with every linear operator $A\colon \mathcal{V} \to \mathcal{V}$
acting in a Euclidean or unitary space $\mathcal{V}$ a bilinear
functional A and associate with the latter a linear operator A*.

Definition 1. The operator A* is called the operator 
adjoint to the operator A. It is uniquely characterized by 
the relation 

(1) (Ax, y) = (x, A*y) 

which must hold for any vectors $x, y \in \mathcal{V}$.

This definition has meaning for a unitary space as
well, but while for a Euclidean space $\mathcal{V}$ the operator A*
is none other than the adjoint operator $A'\colon \mathcal{V}' \to \mathcal{V}'$ regarded,
by virtue of the identification $\mathcal{V}' = \mathcal{V}$, as an operator on
$\mathcal{V}$, for a unitary space $\mathcal{V}$ the operator A* differs from the
operator A', even after the identification of vectors and
covectors, in that it is complex conjugate.

In an arbitrary basis $e_1, \dots, e_n$ of a Euclidean space $\mathcal{V}$
the elements $(a^*)^i_j$ of the matrix of the operator A* are
related to the elements $a^i_j$ of the matrix of the operator A
by the formula

$(a^*)^i_j = g^{ik}\, g_{jl}\, a^l_k.$

For an orthonormal basis $e_1, \dots, e_n$ this formula takes
the form

$(a^*)^i_j = a^j_i.$

In a unitary space $\mathcal{V}$ the corresponding formula (in an
orthonormal basis) is of the form

$(a^*)^i_j = \overline{a^j_i}.$

Thus an operator A* on a Euclidean (unitary) space $\mathcal{V}$ is
adjoint to an operator A if and only if in some (and hence
in any) orthonormal basis its matrix is the transposed
(respectively transposed and complex conjugate) matrix of the
operator A. □
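
The statement about the matrix of A* in an orthonormal basis can be checked on a small example. The sketch below is an illustration only: it assumes the standard orthonormal basis of a two-dimensional unitary space with the scalar product linear in the first argument and conjugate-linear in the second; the matrix, the vectors and the helper names are arbitrary choices made here.

```python
def inner(x, y):
    """Unitary scalar product: linear in x, conjugate-linear in y."""
    return sum(a * b.conjugate() for a, b in zip(x, y))


def apply(m, x):
    """Apply the matrix m (list of rows) to the coordinate column x."""
    return [sum(m[i][j] * x[j] for j in range(len(x))) for i in range(len(m))]


def adjoint(m):
    """Transposed and complex conjugate matrix."""
    n = len(m)
    return [[m[j][i].conjugate() for j in range(n)] for i in range(n)]


A = [[1 + 2j, 3 + 0j],
     [0 - 1j, 2 - 1j]]
x = [1 + 1j, 2 + 0j]
y = [0 + 1j, 1 - 1j]

lhs = inner(apply(A, x), y)            # (Ax, y)
rhs = inner(x, apply(adjoint(A), y))   # (x, A*y)
print(lhs, rhs)
```

Both scalar products agree, as relation (1) requires; for a real matrix the same code reduces to plain transposition.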

For the Euclidean case this statement can be proved
without any calculations, if one recalls that operators A
and A' have transposed matrices in conjugate bases (see
Lecture 13) and that a basis $e_1, \dots, e_n$ is orthonormal if
and only if it coincides with the conjugate basis $e^1, \dots, e^n$
regarded as a basis in $\mathcal{V}$.

The properties of the adjoint operator A* are naturally 
quite similar to those of the adjoint operator A'. For ex¬ 
ample, A** = A and (AB)* = B*A*. The only essential 
difference arises as always in unitary spaces in connection 
with multiplication by numbers. Namely, while (cA)* = cA*
in a Euclidean space, there again arises a “parasite”
complex conjugation in the unitary case: $(cA)^* = \bar{c}A^*$.

The following definition essentially uses the fact that 
operators A and A* act in the same space (and hence does 
not apply to an operator A'). 

Definition 2. An operator $A\colon \mathcal{V} \to \mathcal{V}$ on a Euclidean
or unitary space is said to be self-adjoint if A* = A, i.e. if
for any vectors $x, y \in \mathcal{V}$ we have

(Ax, y) = (x, Ay).

Self-adjoint operators are also called symmetric (or
symmetrical) operators in the Euclidean case and Hermitian
operators in the unitary case.

It is clear that an operator A on a Euclidean (unitary)
space is symmetric (Hermitian) if and only if the corresponding
bilinear (sesquilinear) functional A is symmetric (Hermitian).









For example, in a unitary space

$A(y, x) = (Ay, x) = \overline{(x, Ay)} = \overline{(Ax, y)} = \overline{A(x, y)}.$ □

A sum of self-adjoint operators and a product of a self- 
adjoint operator by a real number are obviously self-adjoint 
operators. This means that self-adjoint operators form a vector
space over the field $\mathbb{R}$ (that is a subspace of the space
$\operatorname{Op}(\mathcal{V})$ in the case of a Euclidean space $\mathcal{V}$).

Note that a product of two self-adjoint operators may 
or may not be a self-adjoint operator. More precisely, a prod¬ 
uct AB of two self-adjoint operators A and B is a self-adjoint 
operator if and only if the operators commute, i.e. AB = BA.

Indeed, if AB = BA, then (AB)* = (BA)* = A*B* = AB.
Conversely, if (AB)* = AB, then BA = B*A* = (AB)* = AB. □
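
A quick check of this criterion in the Euclidean case, where in an orthonormal basis self-adjoint operators have symmetric matrices. The matrices and helper names below are assumptions made for the illustration, not examples from the lectures.

```python
def matmul(a, b):
    """Product of two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def is_symmetric(m):
    """True if the matrix coincides with its transpose."""
    n = len(m)
    return all(m[i][j] == m[j][i] for i in range(n) for j in range(n))


A = [[1, 2], [2, 3]]   # symmetric
B = [[0, 1], [1, 0]]   # symmetric, does not commute with A
C = [[5, 2], [2, 5]]   # symmetric, C = 5E + 2B, hence commutes with B

AB, BA = matmul(A, B), matmul(B, A)
BC, CB = matmul(B, C), matmul(C, B)

print(AB == BA, is_symmetric(AB))   # non-commuting pair
print(BC == CB, is_symmetric(BC))   # commuting pair
```

For the non-commuting pair the product fails to be symmetric, while for the commuting pair it is symmetric.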

A square matrix $A = (a^i_j)$ consisting of complex numbers
is said to be Hermitian if after transposing it coincides
with the complex conjugate matrix, i.e. if

$a^i_j = \overline{a^j_i}$ for any i, j = 1, ..., n.

It is clear that in a Euclidean (unitary) space an operator A
is symmetric (Hermitian) if and only if in some (and hence
also in any) orthonormal basis its matrix is symmetric
(Hermitian). □

Definition 3. An operator A on a Euclidean space $\mathcal{V}$ is
said to be skew-symmetric if A* = −A, i.e. if

(Ax, y) + (x, Ay) = 0
for any vectors $x, y \in \mathcal{V}$.

Similarly, an operator A on a unitary space is said
to be skew-Hermitian if A* = −A.

Skew-symmetric operators constitute a quite independent 
class of linear operators. Skew-symmetric bilinear function¬ 
als correspond to them and in coordinates they are character¬ 
ized by the fact that their matrices are skew-symmetric 
in every orthonormal basis. They form a subspace in the
space $\operatorname{Op}(\mathcal{V})$, the space $\operatorname{Op}(\mathcal{V})$ of all linear operators
(cf. Proposition 1 of Lecture 11) being decomposable as a direct
sum of the subspaces of symmetric and skew-symmetric
operators, i.e. any linear operator A can be represented
by the sum

(2) $A = A_{\mathrm{symm}} + A_{\mathrm{skew}}$

of a symmetric operator $A_{\mathrm{symm}}$ and a skew-symmetric
operator $A_{\mathrm{skew}}$, where

$A_{\mathrm{symm}} = \frac{A + A^*}{2}, \quad A_{\mathrm{skew}} = \frac{A - A^*}{2}.$
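
Decomposition (2) is easy to compute in an orthonormal basis of a Euclidean space, where the matrix of A* is the transpose of the matrix of A. The following sketch uses an arbitrary 2 × 2 matrix and helper names chosen for this illustration.

```python
def transpose(m):
    """Transpose of a matrix given as a list of rows."""
    return [list(row) for row in zip(*m)]


def half_combination(a, b, sign):
    """Entrywise (a + sign * b) / 2."""
    return [[(x + sign * y) / 2 for x, y in zip(ra, rb)]
            for ra, rb in zip(a, b)]


A = [[1.0, 2.0],
     [4.0, 3.0]]
A_star = transpose(A)                       # matrix of A* in an orthonormal basis
A_symm = half_combination(A, A_star, +1)    # (A + A*) / 2
A_skew = half_combination(A, A_star, -1)    # (A - A*) / 2

print(A_symm)
print(A_skew)
```

The first part is symmetric, the second is skew-symmetric, and their sum restores A.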

For Hermitian operators the situation is quite different, 
since skew-Hermitian operators can be reduced in a trivial 
way to Hermitian operators, a fact having no analogues in 
Euclidean space. Namely, it follows immediately from the
relation $(iA)^* = \bar{i}A^* = -iA^*$ that an operator is
skew-Hermitian if and only if it has the form iA, where A is
a Hermitian operator. □

At the same time the analogue of decomposition (2) obviously
remains valid for operators on unitary space. Therefore
any operator A on a unitary space can be uniquely represented
as

A = B + iC,

where B and C are Hermitian operators. This means (see
Definition 1 of Lecture 19 in [1]) that for any unitary space $\mathcal{V}$
the vector space $\operatorname{Op}(\mathcal{V})$ carries the natural structure of a
real-complex vector space, the corresponding real subspace being
the space of Hermitian operators. □

We thus see that in a certain respect Hermitian operators 
are similar to real numbers. This similarity can be traced 
in other respects too. 

According to Definition 1 of Lecture 11 and the relation, 
established above for Euclidean spaces, between symmetric 
bilinear functionals and symmetric linear operators, in 
Euclidean space every quadratic functional can be uniquely 
represented as (Ax, x), where A is some symmetric linear
operator. Functionals of this form present nothing new for
nonsymmetric linear operators, since (Ax, x) = 0 for all $x \in \mathcal{V}$
if (and only if) the operator A is skew-symmetric.

For unitary spaces the situation turns out to be
fundamentally different. This is not surprising, however, for in
a unitary space no functional of the form (Ax, x), with
A ≠ 0, is a quadratic functional in the sense of Definition 1
of Lecture 11 and therefore there are no reasons for the
properties of such functionals to resemble those of quadratic
functionals.

In a Euclidean space a functional (Ax, x) could be identic¬ 
ally zero without the operator A being zero. In a unitary 
space this is not possible. 

Proposition 1. If a linear operator $A\colon \mathcal{V} \to \mathcal{V}$ on a unitary
space $\mathcal{V}$ possesses the property that

(3) (Ax, x) = 0

for any vector $x \in \mathcal{V}$, then A = 0.

Proof. Since for any vectors $x, y \in \mathcal{V}$ we have

(A(x + y), x + y) = (Ax, x) + (Ax, y) + (Ay, x) + (Ay, y),

(A(x + iy), x + iy) =

= (Ax, x) + (Ax, iy) + (iAy, x) + (iAy, iy)

and

(Ax, iy) = −i (Ax, y), (iAy, x) = i (Ay, x),
in view of (3)

(Ax, y) + (Ay, x) = 0,

(Ax, y) − (Ay, x) = 0,

and hence (Ax, y) = 0. Putting here y = Ax, we have
(Ax, Ax) = 0. Therefore Ax = 0 for any $x \in \mathcal{V}$. □
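
The contrast between the Euclidean and unitary cases can be seen on the rotation of the plane through a right angle. The sketch below is an illustration under assumptions: the matrix J, the sample vectors and the helper names are chosen here, and the scalar product is taken linear in the first argument and conjugate-linear in the second.

```python
def inner(x, y):
    """Scalar product, conjugate-linear in the second argument."""
    return sum(a * complex(b).conjugate() for a, b in zip(x, y))


def apply(m, x):
    """Apply the matrix m (list of rows) to the coordinate column x."""
    return [sum(m[i][j] * x[j] for j in range(len(x))) for i in range(len(m))]


# Skew-symmetric matrix of the rotation through 90 degrees.
J = [[0, -1],
     [1, 0]]

# On real vectors (Jx, x) vanishes identically although J is not zero.
real_values = [inner(apply(J, x), x) for x in ([1, 0], [2, 3], [-1, 5])]

# The same matrix on a complex vector gives a nonzero value of (Jx, x).
z = [1, 1j]
complex_value = inner(apply(J, z), z)

print(real_values, complex_value)
```

So the same matrix, regarded as an operator on a unitary space, no longer satisfies (3), in agreement with Proposition 1.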

Proposition 2 (Hermitian property criterion). A linear
operator A on a unitary space $\mathcal{V}$ is Hermitian if and only if
for any vector x the number (Ax, x) is real.

Proof. If the operator A is Hermitian, then for any vector
$x \in \mathcal{V}$

$(Ax, x) = (x, Ax) = \overline{(Ax, x)}$

and hence the number (Ax, x) is real. Conversely, if
$(Ax, x) = \overline{(Ax, x)}$, then

$((A - A^*)x, x) = (Ax, x) - (A^*x, x) =$

$= (Ax, x) - (x, Ax) =$

$= (Ax, x) - \overline{(Ax, x)} = 0,$

and, therefore, according to Proposition 1, A − A* = 0. □




The following propositions are true for both Euclidean 
and unitary spaces (although, in general, each requiring 
a different proof). 

Proposition 3 (reality). All characteristic roots of an ar¬ 
bitrary self-adjoint operator are real. 

Proof. Let A be a self-adjoint operator in a Euclidean or
unitary space $\mathcal{V}$ and let $\lambda$ be its arbitrary characteristic
root.

If the space $\mathcal{V}$ is unitary (and hence the operator A is
Hermitian), then the number $\lambda$ is an eigenvalue of the
operator A, i.e. there exists a vector $x_0 \ne 0$ such that
$Ax_0 = \lambda x_0$. For that vector $(Ax_0, x_0) = (\lambda x_0, x_0) = \lambda (x_0, x_0)$
and hence

$\lambda = \frac{(Ax_0, x_0)}{(x_0, x_0)}.$

To complete the proof of Proposition 3 in this case, it
remains to note that according to Proposition 2 the right-hand
side of this formula is real. Therefore, so is the number
$\lambda$.

Now let $\mathcal{V}$ be a Euclidean space. Arguing by contradiction,
assume that $\lambda = \alpha + i\beta$, where $\beta \ne 0$. Then, as was
shown in Lecture 16, for an operator A there exists a
two-dimensional invariant subspace $\mathcal{P}$ in the space $\mathcal{V}$ and there
is a basis x, y in $\mathcal{P}$ such that

$Ax = \alpha x - \beta y,$

$Ay = \beta x + \alpha y.$

Therefore

$(Ax, y) = (\alpha x - \beta y, y) = \alpha (x, y) - \beta (y, y)$
and

$(x, Ay) = (x, \beta x + \alpha y) = \beta (x, x) + \alpha (x, y).$

Since the operator A is self-adjoint (symmetric) and hence
(Ax, y) = (x, Ay) it follows that

$\beta [(x, x) + (y, y)] = 0.$

Since this last equation is impossible (for (x, x) > 0,
(y, y) > 0 and by hypothesis $\beta \ne 0$) this proves that
$\lambda \in \mathbb{R}$. □







Proposition 4 (orthogonality). Any two eigenvectors x and
y of a self-adjoint operator A belonging to different eigenvalues
$\lambda$ and $\mu$ are orthogonal.

Proof. We have

$(Ax, y) = (\lambda x, y) = \lambda (x, y),$

$(x, Ay) = (x, \mu y) = \mu (x, y)$

(the last equation is true in a unitary space as well, since
according to Proposition 3 the number $\mu$ is real). Therefore,
by virtue of self-adjointness,

$\lambda (x, y) = \mu (x, y),$

which is possible for $\lambda \ne \mu$ only when (x, y) = 0. □

Proposition 5 (on the orthogonal complement). For any
self-adjoint operator A, the orthogonal complement $\mathcal{P}^\perp$ of
an arbitrary invariant subspace $\mathcal{P}$ is also an invariant subspace.

Proof. If $x \in \mathcal{P}^\perp$, then (x, y) = 0 for all $y \in \mathcal{P}$ and
therefore (Ax, y) = (x, Ay) = 0, since by hypothesis
$Ay \in \mathcal{P}$. Hence $Ax \in \mathcal{P}^\perp$. □

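Proposition 6 (on multiplicities). The geometric multiplicity
$p_{\lambda_0}$ of an arbitrary eigenvalue $\lambda_0$ of a self-adjoint operator
A equals its algebraic multiplicity $n_{\lambda_0}$.

Proof. Let $\mathcal{P}_{\lambda_0}$ be the proper subspace belonging to an
eigenvalue $\lambda_0$ and let $e_1, \dots, e_n$ be an orthonormal basis
of the space $\mathcal{V}$ such that $\mathcal{P}_{\lambda_0} = [e_1, \dots, e_p]$ (and therefore
such that $\mathcal{P}_{\lambda_0}^\perp = [e_{p+1}, \dots, e_n]$), where $p = p_{\lambda_0}$. Since,
according to Proposition 5,

$A\mathcal{P}_{\lambda_0}^\perp \subset \mathcal{P}_{\lambda_0}^\perp,$

in that basis the operator A has a matrix of the form

$\begin{pmatrix} \lambda_0 E & 0 \\ 0 & B \end{pmatrix},$

where E is the unit matrix of order p and B is the matrix of
an operator $B = A|_{\mathcal{P}_{\lambda_0}^\perp}$. Hence
$f_A(\lambda) = (\lambda_0 - \lambda)^p f_B(\lambda)$ and therefore if $n_{\lambda_0} > p$ then
$f_B(\lambda_0) = 0$ and so $\lambda_0$ is an eigenvalue of the operator B.
The corresponding eigenvector in $\mathcal{P}_{\lambda_0}^\perp$ is an eigenvector
of the operator A belonging to the eigenvalue $\lambda_0$, which is
impossible since all these vectors lie in $\mathcal{P}_{\lambda_0}$. Consequently,
$p_{\lambda_0} \ge n_{\lambda_0}$ and hence $p_{\lambda_0} = n_{\lambda_0}$ (since always
$p_{\lambda_0} \le n_{\lambda_0}$; see Lecture 14). □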

Remark. In the proof of Proposition 6 we used only the
property of a self-adjoint operator that the orthogonal
complement of each of its proper subspaces is an invariant subspace
(so we did not even need to fully use Proposition 5). Therefore
Proposition 6 is true for any operator for which the orthogonal
complement of every proper subspace is invariant. □

According to Theorem 1 of Lecture 16, it follows from
Proposition 6 (together with Proposition 3, for Euclidean
spaces) that an operator A is diagonalizable, i.e.

$\mathcal{V} = \mathcal{P}_{\lambda_1} \oplus \dots \oplus \mathcal{P}_{\lambda_m},$

where $\lambda_1, \dots, \lambda_m$ are all possible eigenvalues of that
operator. By choosing an orthonormal basis in each of the
subspaces $\mathcal{P}_{\lambda_1}, \dots, \mathcal{P}_{\lambda_m}$ we obtain, in view of Proposition 4, an
orthonormal basis of the space $\mathcal{V}$ in which the operator A
has a diagonal matrix.

Definition 4. An operator A in a Euclidean or unitary
space $\mathcal{V}$ is said to be orthogonally diagonalizable if in the
space $\mathcal{V}$ there exists an orthonormal basis in which the
matrix of the operator A is diagonal (i.e. which consists of
eigenvectors of that operator).

We thus see that we have proved the following theorem.

Theorem 1. In a Euclidean or unitary space, any self-adjoint
operator is orthogonally diagonalizable. □
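
For a symmetric operator on the Euclidean plane an orthonormal basis of eigenvectors can be written down explicitly. The sketch below is not the argument of the lecture but a standard closed-form computation for a matrix [[a, b], [b, d]] given in an orthonormal basis: the basis is rotated through an angle theta with tan 2·theta = 2b / (a − d). The function names and the sample matrix are assumptions of this illustration.

```python
import math


def diagonalize_symmetric_2x2(a, b, d):
    """Return (lam1, lam2, f1, f2) for the symmetric matrix [[a, b], [b, d]].

    f1, f2 form an orthonormal basis of eigenvectors and lam1, lam2 are
    the corresponding eigenvalues.
    """
    theta = 0.5 * math.atan2(2.0 * b, a - d)
    c, s = math.cos(theta), math.sin(theta)
    f1, f2 = (c, s), (-s, c)
    lam1 = a * c * c + 2.0 * b * c * s + d * s * s   # (A f1, f1)
    lam2 = a * s * s - 2.0 * b * c * s + d * c * c   # (A f2, f2)
    return lam1, lam2, f1, f2


def apply_symmetric(a, b, d, v):
    """Apply the matrix [[a, b], [b, d]] to the vector v."""
    return (a * v[0] + b * v[1], b * v[0] + d * v[1])


a, b, d = 2.0, 1.0, 2.0
lam1, lam2, f1, f2 = diagonalize_symmetric_2x2(a, b, d)
print(lam1, lam2, f1, f2)
```

In the basis f1, f2 the operator has the diagonal matrix with the entries lam1 and lam2, which for this sample matrix are close to 3 and 1.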








Lecture 20 


Bringing quadratic forms into canonical form by orthogonal
transformation of variables • Second degree hypersurfaces in
a Euclidean point space • The minimax property of eigenvalues
of self-adjoint operators • Orthogonally diagonalizable operators


Theorem 1 of the preceding lecture states that in every 
Euclidean space any self-adjoint operator is orthogonally 
diagonalizable. We reformulate the theorem in terms of 
symmetric bilinear (or, equivalently, quadratic) forms. 
Let

(1) $Q(x_1, \dots, x_n) = \sum_{i,j=1}^{n} q_{ij} x_i x_j$

be an arbitrary quadratic form in n variables $x_1, \dots, x_n$
with real coefficients $q_{ij}$, i, j = 1, ..., n.

On choosing in an n-dimensional Euclidean space $\mathcal{V}$ an
orthonormal basis $e_1, \dots, e_n$ we may consider in $\mathcal{V}$ a
quadratic functional Q expressed in that basis as $Q(x_1, \dots, x_n)$
and hence the corresponding symmetric linear operator
$Q\colon \mathcal{V} \to \mathcal{V}$ (i.e. such that Q (x) = (Qx, x) for any vector
$x \in \mathcal{V}$). According to Theorem 1 of the preceding lecture,
in the space $\mathcal{V}$ there exists an orthonormal basis $f_1, \dots, f_n$
in which the operator Q has a diagonal matrix with diagonal
elements $\lambda_1, \dots, \lambda_n$. This implies that for any vector
$x \in \mathcal{V}$ we have

$Q(x) = \lambda_1 y_1^2 + \dots + \lambda_n y_n^2,$

where

(2) $\begin{aligned} y_1 &= c_{11} x_1 + \dots + c_{1n} x_n, \\ &\;\;\vdots \\ y_n &= c_{n1} x_1 + \dots + c_{nn} x_n \end{aligned}$

are the coordinates of the vector x in the basis $f_1, \dots, f_n$.
Since both bases $e_1, \dots, e_n$ and $f_1, \dots, f_n$ are orthonormal,
transformation (2) is orthogonal, i.e. the matrix C
of its coefficients is an orthogonal matrix (see Lecture 14
in [1]). This proves the following theorem.

Theorem 1. Any quadratic form (1) can be reduced by an
orthogonal transformation of the variables to the form

(3) $\lambda_1 y_1^2 + \dots + \lambda_n y_n^2.$

The coefficients $\lambda_1, \dots, \lambda_n$ are the roots of the equation

$\det(Q - \lambda E) = 0$

and are therefore uniquely determined (up to an order). □

The theorem formally differs from the (substantially
simpler) Lagrange theorem of Lecture 11 only in that bringing
into the canonical form (3) is achieved not by an arbitrary,
but by an orthogonal transformation of the variables (2).
That is why the canonical form (3) proves to be unique.

Just as the Lagrange theorem allowed us to give a clas¬ 
sification of second degree hypersurfaces of an n-dimensional 
affine space (see Lecture 12), so Theorem 1 leads to a similar 
classification of second degree hypersurfaces in an n-dimen- 
sional Euclidean point (real-complex) space. Indeed, repeat¬ 
ing word for word the proof of Theorem 5 in Lecture 13 and 
only referring to Theorem 1 instead of the Lagrange theorem 
we obtain immediately the following theorem. 

Theorem 2 (bringing the equations of second degree
hypersurfaces in an n-dimensional Euclidean space into canonical
form). For any second degree hypersurface in an n-dimensional
($n \ge 1$) real-complex Euclidean space there exists a system
of rectangular coordinates $x_1, \dots, x_n$ in which its equation
has either the form

(I) $\lambda_1 x_1^2 + \dots + \lambda_r x_r^2 = \varepsilon,$

where $1 \le r \le n$ and $\varepsilon = 0$ or 1, or (which is possible only
when n > 1) the form

(II) $\lambda_1 x_1^2 + \dots + \lambda_r x_r^2 = 2x_{r+1},$

where $1 \le r \le n - 1$, with $\lambda_1 \ne 0, \dots, \lambda_r \ne 0$ in both
cases. □






In order to uniquely fix the coefficients $\lambda_1, \dots, \lambda_r$ (which,
we note, are proportional to the nonzero roots of the
corresponding characteristic polynomial, repeated as many times
as is their multiplicity) one should first order them in a
reasonable way (i.e. interchange appropriately the coordinates
$x_1, \dots, x_r$). We require that the positive coefficients
come first and the negative ones after them. Besides,
in either group the coefficients should be arranged in the
order of increasing absolute values. Thus, if p, $0 \le p \le r$,
is the number of positive coefficients, then we assume that

$0 < \lambda_1 \le \dots \le \lambda_p$

and

$0 < |\lambda_{p+1}| \le |\lambda_{p+2}| \le \dots \le |\lambda_r|.$

We can in addition get

(4) $p \ge r - p$

for $\varepsilon = 0$ in case (I) by multiplying by −1. We can obtain
the same result also in case (II) by changing, if necessary,
the sign of the coordinate $x_{r+1}$. Therefore, for the purpose
of uniformity, we shall also admit in case (I) the value
$\varepsilon = -1$, satisfying in this way condition (4).

Finally, we shall suppose in case (I) for $\varepsilon = 0$ that

$|\lambda_1| + \dots + |\lambda_r| = 1.$

Equations (I) and (II) satisfying these conditions will 
be called the Euclidean canonical equations of second degree 
hypersurfaces. 

For n = 2 and n = 3 we obviously obtain (up to notation)
the canonical equations of second degree curves in
the plane and of second degree surfaces in three-dimensional
space, enumerated in Lectures 22 and 23 of [1].

Bringing the equations of hypersurfaces into canonical
form by the method employed in proving Theorem 2 (i.e. by
the method of Lecture 13 making use of Theorem 1 instead
of the Lagrange theorem) we shall all the time obtain, as
can easily be seen, the same canonical equation (although,
possibly, in different systems of coordinates). Although
this does not prove yet that there are no coordinates in
which one obtains a different canonical equation,
nevertheless it is so:

Theorem 3 (classification of second degree hypersurfaces
of an n-dimensional real-complex Euclidean space). Two
second degree hypersurfaces in an n-dimensional real-complex
Euclidean space are Euclidean equivalent if and only if they
have the same canonical equations.

We know, from the example of second degree curves in
the plane (see Lecture 22 in [1]), how to proceed in proving
this theorem. The method is to characterize the coefficients
$\lambda_1, \dots, \lambda_r$ geometrically regardless of coordinates. To
clarify the idea of the general method, let us consider in the
plane an ellipse

$\frac{x^2}{a^2} + \frac{y^2}{b^2} = 1,$

where $a \ge b$ (in this case, $\lambda_1 = \frac{1}{a^2}$, $\lambda_2 = \frac{1}{b^2}$).
The left-hand side $\frac{x^2}{a^2} + \frac{y^2}{b^2}$ is a quadratic form in the
coordinates x, y of the points of the plane. If we consider
this quadratic form only for $x^2 + y^2 = 1$ (on a “unit circle”),
then, as can easily be seen, its maximum is $\lambda_2 = \frac{1}{b^2}$ and its
minimum is $\lambda_1 = \frac{1}{a^2}$.

In the case of the ellipsoid

$\frac{x^2}{a^2} + \frac{y^2}{b^2} + \frac{z^2}{c^2} = 1,$

where $a \ge b \ge c$, the coefficient $\frac{1}{c^2}$ is similarly equal to
the maximum of the quadratic form $\frac{x^2}{a^2} + \frac{y^2}{b^2} + \frac{z^2}{c^2}$ on a
“unit sphere” $x^2 + y^2 + z^2 = 1$, and the coefficient $\frac{1}{a^2}$ is
equal to the minimum. The “middle” coefficient $\frac{1}{b^2}$ is more
difficult to characterize.

To this end consider all possible sections of an ellipsoid by
planes passing through its centre. These sections are ellipses
and the corresponding coefficients $\lambda_1$ and $\lambda_2$ are defined
for them. These coefficients are of course dependent on the
choice of the plane and, as can easily be seen, the lowest
possible value of the largest coefficient is just equal to $\frac{1}{b^2}$.
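
The statement about the ellipse can be checked by sampling the unit circle. This is only a numerical sketch under assumed values a = 2, b = 1; the function name and the number of sample points are arbitrary choices of this illustration.

```python
import math


def extremes_on_unit_circle(lam1, lam2, samples=3600):
    """Sampled minimum and maximum of lam1*x^2 + lam2*y^2 on x^2 + y^2 = 1."""
    values = []
    for k in range(samples):
        t = 2.0 * math.pi * k / samples
        x, y = math.cos(t), math.sin(t)
        values.append(lam1 * x * x + lam2 * y * y)
    return min(values), max(values)


a, b = 2.0, 1.0                       # semi-axes of the ellipse, a >= b
lowest, highest = extremes_on_unit_circle(1.0 / a ** 2, 1.0 / b ** 2)
print(lowest, highest)
```

Up to rounding, the sampled minimum and maximum agree with 1/a² and 1/b².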

It turns out that a similar geometrical characterization
of the coefficients $\lambda_1, \dots, \lambda_r$ is possible in the general case
as well. This is based on the corresponding statement about
the eigenvalues of operators and we shall restrict ourselves 
to the proof of that statement. The transition to the coef¬ 
ficients of the equations of hypersurfaces is quite trivial, 
but we have no time to spare. 


So we again return to the Euclidean vector space $\mathcal{V}$ and
the symmetric operator A given in it. We may assume,
however, without any changes in the formulations and proof
that the space $\mathcal{V}$ is unitary and the operator A is Hermitian.

In both cases (see Proposition 3 of the preceding lecture)
all eigenvalues (= characteristic roots) of the operator A
are real. By repeating each of them as many times as is its
multiplicity (and hence obtaining precisely n of them) we
number the eigenvalues in decreasing order:

$\lambda_1 \ge \lambda_2 \ge \dots \ge \lambda_n.$

Our aim is to find a direct “geometrical” description of these
numbers.

Let $\mathcal{P}$ be an arbitrary subspace of the space $\mathcal{V}$ and
$S = S(\mathcal{P})$ its subset (“unit sphere”) consisting of all vectors
$x \in \mathcal{P}$ for which (x, x) = 1.

Since for any vector $x \in S$ the number (Ax, x) is real
(when $\mathcal{V}$ is Euclidean, this is self-evident, and when $\mathcal{V}$ is
unitary, it is ensured by Proposition 2 of the preceding
lecture), the number

$a(\mathcal{P}) = \sup\{(Ax, x);\ x \in \mathcal{P},\ (x, x) = 1\}$

is defined (instead of sup one may write max, however,
since the sphere $S(\mathcal{P})$ is compact).

Proposition 1. For any q = 1, ..., n, we have

$\lambda_q = \inf\{a(\mathcal{P});\ \dim \mathcal{P} = n - q + 1\},$

where inf is taken over all subspaces $\mathcal{P} \subset \mathcal{V}$ of dimension
n − q + 1.





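Proof. In the space $\mathcal{V}$, according to Theorem 1 of the
preceding lecture, there exists an orthonormal basis
$e_1, \dots, e_n$ such that

$Ae_i = \lambda_i e_i$ for any i = 1, ..., n.

Let $\mathcal{Q}_q = [e_1, \dots, e_q]$ and let $\mathcal{P}$ be an arbitrary subspace
of dimension n − q + 1. Since

$\dim \mathcal{Q}_q + \dim \mathcal{P} = q + (n - q + 1) = n + 1 > n,$

we have, according to Theorem 1 of Lecture 1,
$\mathcal{Q}_q \cap \mathcal{P} \ne 0$, i.e. there exists a nonzero vector
$x \in \mathcal{Q}_q \cap \mathcal{P}$. We may assume without loss of generality
that (x, x) = 1. Since $x \in \mathcal{P}$ we have $a(\mathcal{P}) \ge (Ax, x)$, and
since $x \in \mathcal{Q}_q$ and hence $x = x_1 e_1 + \dots + x_q e_q$, we have

$(Ax, x) = (\lambda_1 x_1 e_1 + \dots + \lambda_q x_q e_q,\ x_1 e_1 + \dots + x_q e_q) =$

$= \lambda_1 |x_1|^2 + \dots + \lambda_q |x_q|^2 \ge \lambda_q (|x_1|^2 + \dots + |x_q|^2) =$

$= \lambda_q (x, x) = \lambda_q.$

Thus $a(\mathcal{P}) \ge \lambda_q$ for any subspace $\mathcal{P}$ of dimension
n − q + 1 and hence

$\inf\{a(\mathcal{P});\ \dim \mathcal{P} = n - q + 1\} \ge \lambda_q.$

On the other hand, since for any vector
$x = x_q e_q + \dots + x_n e_n$ with (x, x) = 1 of the subspace
$\mathcal{P}_q = [e_q, \dots, e_n]$ of dimension n − q + 1 there is an inequality

$(Ax, x) = \lambda_q |x_q|^2 + \dots + \lambda_n |x_n|^2 \le$

$\le \lambda_q (|x_q|^2 + \dots + |x_n|^2) = \lambda_q (x, x) = \lambda_q,$

we have

$a(\mathcal{P}_q) \le \lambda_q,$

and hence

$\inf\{a(\mathcal{P});\ \dim \mathcal{P} = n - q + 1\} \le \lambda_q.$ □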

The property of the eigenvalues of self-adjoint operators 
we have proved is called the minimax property of eigenvalues. 
The proof of Theorem 3 is now obvious. We leave it to 
the reader to give the details. 






In a Euclidean space every orthogonally diagonalizable 
operator, having in some orthonormal basis a diagonal, and 
hence symmetric, matrix, is symmetric (self-adjoint). This 
proves the following theorem. 

Theorem 4. In a Euclidean space a linear operator is
orthogonally diagonalizable if and only if it is symmetric. □

In a unitary space, however, self-adjoint (Hermitian) 
operators make only a part of all orthogonally diagonalizable 
operators, since in a Hermitian matrix all diagonal elements 
must be real. Therefore an operator having in some ortho- 
normal basis a diagonal matrix at least one of whose ele¬ 
ments is nonreal is orthogonally diagonalizable but not 
Hermitian. 

Definition 1. An operator A in a unitary (or Euclidean)
space is said to be normal if it commutes with the
adjoint operator A*.

Recall (see the preceding lecture) that in a unitary space 
any operator A can be uniquely represented as 

A = B -f iC, 

where B and C are Hermitian operators. 

Proposition 2. In a unitary space an operator A = B + iC
is normal if and only if the operators B and C are commutative
(BC = CB).

Proof. Since

$$A^* = B^* + (iC)^* = B^* - iC^* = B - iC,$$

we have

$$AA^* = (B + iC)(B - iC) = B^2 + C^2 + i\,(CB - BC)$$

and

$$A^*A = (B - iC)(B + iC) = B^2 + C^2 - i\,(CB - BC).$$

Therefore AA* = A*A if and only if CB − BC = 0. □

Note that for a normal operator A the operator AA* = A*A 
is expressed by the formula 

$$AA^* = B^2 + C^2,$$

similar to the formula for the square of the modulus of a 
complex number. 
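
Proposition 2 is easy to test on 2×2 matrices. The following sketch is an illustration only: it assumes an orthonormal basis (so that the adjoint operator has the conjugate transposed matrix) and recovers the Hermitian parts by the formulas B = (A + A*)/2 and C = (A − A*)/(2i). All function names are hypothetical.

```python
def matmul(X, Y):
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def adjoint(X):
    """Conjugate transposed matrix (the matrix of A* in an orthonormal basis)."""
    n = len(X)
    return [[complex(X[j][i]).conjugate() for j in range(n)] for i in range(n)]

def hermitian_parts(A):
    """Return Hermitian B, C with A = B + iC."""
    n, As = len(A), adjoint(A)
    B = [[(A[i][j] + As[i][j]) / 2 for j in range(n)] for i in range(n)]
    C = [[(A[i][j] - As[i][j]) / 2j for j in range(n)] for i in range(n)]
    return B, C

def close(X, Y, tol=1e-12):
    return all(abs(x - y) <= tol for rx, ry in zip(X, Y) for x, y in zip(rx, ry))

def is_normal(A):
    As = adjoint(A)
    return close(matmul(A, As), matmul(As, A))

def parts_commute(A):
    B, C = hermitian_parts(A)
    return close(matmul(B, C), matmul(C, B))

rotation = [[0, -1], [1, 0]]   # rotation by a right angle: normal
shear = [[1, 1], [0, 1]]       # Jordan block: not normal
for name, M in (("rotation", rotation), ("shear", shear)):
    print(name, is_normal(M), parts_commute(M))
```

For each of the two matrices the two flags agree, as the proposition asserts.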





If in some orthonormal basis an operator A has a diagonal 
matrix A, then in the same basis the adjoint operator has 
a complex conjugate and transposed, and hence also diago¬ 
nal, matrix. Since any two diagonal matrices commute, 
so do the operators A and A*. This proves that in a unitary 
space any orthogonally diagonalizable operator is normal. □ 
Our immediate aim is to prove the converse. To do this 
we shall try to extend to the case of normal operators Propo¬ 
sitions 3 to 5 of the preceding lecture. 

Proposition 3 of Lecture 19 cannot of course be directly 
generalized to normal operators, since the eigenvalues 
(= characteristic roots) of a normal operator may be any 
complex numbers. Its analogue for normal operators is the 
following proposition from which incidentally Proposition 3 
of Lecture 19 immediately follows for unitary spaces: 

Proposition 3. Any eigenvector of a normal operator A
belonging to an eigenvalue $\lambda$ is an eigenvector of the adjoint
operator A* belonging to the eigenvalue $\bar\lambda$.

Proof. If the operator A is normal, then for any vector x

$$(Ax, Ax) = (A^*Ax, x) = (AA^*x, x) = (A^*x, A^*x),$$

i.e.

$$|Ax| = |A^*x|.$$

Since every operator of the form $A - \lambda E$ is normal, as well
as the operator A, it follows (as $(A - \lambda E)^* = A^* - \bar\lambda E$)
that for any $\lambda$

$$|(A - \lambda E)\,x| = |(A^* - \bar\lambda E)\,x|.$$

Therefore, if $(A - \lambda E)\,x = 0$, then $(A^* - \bar\lambda E)\,x = 0$. □
Proposition 4 of Lecture 19 remains unaffected for normal 
operators: 

Proposition 4. Any two eigenvectors x and y of a normal
operator A belonging to different eigenvalues $\lambda$ and $\mu$ are
orthogonal.

Proof. If $Ax = \lambda x$, then $(Ax, y) = \lambda (x, y)$. Similarly,
if $Ay = \mu y$ and hence, according to Proposition 3, $A^*y = \bar\mu y$,
then $(x, A^*y) = (x, \bar\mu y) = \mu (x, y)$. Consequently,
$\lambda (x, y) = (Ax, y) = (x, A^*y) = \mu (x, y)$ and therefore
$(x, y) = 0$ (for by hypothesis $\lambda \ne \mu$). □
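
Propositions 3 and 4 can be observed on the simplest normal operator with nonreal eigenvalues. The sketch below is an assumption-laden illustration: the operator is the rotation of the plane by a right angle, written in an orthonormal basis of the two-dimensional unitary space, and the scalar product is taken linear in the first argument, as in the proofs above. The function names are hypothetical.

```python
def inner(x, y):
    """Hermitian scalar product, linear in the first argument."""
    return sum(a * b.conjugate() for a, b in zip(x, y))

def apply(M, x):
    return [sum(M[i][j] * x[j] for j in range(len(x))) for i in range(len(M))]

def adjoint(M):
    n = len(M)
    return [[M[j][i].conjugate() for j in range(n)] for i in range(n)]

A = [[0j, -1 + 0j], [1 + 0j, 0j]]   # rotation by a right angle: AA* = A*A = E
x, lam = [1 + 0j, -1j], 1j          # eigenvector with eigenvalue i
y, mu = [1 + 0j, 1j], -1j           # eigenvector with eigenvalue -i

Ax = apply(A, x)
Ay = apply(A, y)
A_star_x = apply(adjoint(A), x)

print(Ax == [lam * c for c in x])                       # x is an eigenvector of A
print(A_star_x == [lam.conjugate() * c for c in x])     # and of A*, with the conjugate eigenvalue
print(inner(x, y))                                      # eigenvectors for different eigenvalues
```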






On the contrary, Proposition 5 of Lecture 19 is in general 
false for normal operators: there exist normal operators 
having invariant subspaces with noninvariant orthogonal 
complement (construct an example!). For proper subspaces (eigenspaces)
it proves to be true, however:

Proposition 5. The orthogonal complement $\mathcal{P}^\perp$ of an arbitrary
proper subspace $\mathcal{P}$ of a normal operator A is invariant
under A.

Proof. If $x \in \mathcal{P}^\perp$, then $(x, y) = 0$ for any vector $y \in \mathcal{P}$.
Therefore $(Ax, y) = (x, A^*y) = (x, \bar\lambda y) = \lambda (x, y) = 0$, for
according to Proposition 3 $A^*y = \bar\lambda y$. Consequently $Ax \in \mathcal{P}^\perp$.
□

As was already noted in the preceding lecture, it is only 
this property of the operator A that is necessary in the 
proof of Proposition 6. Therefore this proposition remains 
valid for any normal operator, which, in view of Proposi¬ 
tion 4, ensures the orthogonal diagonalizability of the 
operator. 

We have thus proved the following theorem: 

Theorem 5. In a unitary space a linear operator is orthogonally
diagonalizable if and only if it is normal. □

This theorem allows the properties of a normal operator 
to be reduced to those of its spectrum. For example, it is 
now obvious that in a unitary space a normal operator A is 

(a) Hermitian, 

(b) invertible, 

(c) idempotent (i.e. $A^2 = A$)

if and only if its eigenvalues are respectively

(a') real,

(b') nonzero,

(c') equal to zero or unity.

Note that the implications (a) $\Rightarrow$ (a'), (b) $\Rightarrow$ (b'), and
(c) $\Rightarrow$ (c') hold for any linear operators. The converse
(and most interesting) implications, however, hold only for
normal operators (construct corresponding examples!).

Of course, similar statements about the equivalence of 
the properties hold also for symmetric operators in a Eucli¬ 
dean space. 


Lecture 21 


Positive operators • Isometric operators • Unitary matrices • 
Polar factorization of invertible operators • A geometrical 
interpretation of polar factorization • Parallel translations 
and centroaffine transformations • Bringing a unitary operator 
into diagonal form • A rotation of an n-dimensional Euclidean 
space as a composition of rotations in two-dimensional planes 


Proposition 1. The following properties of a linear operator A
are equivalent, in a Euclidean or a unitary space $\mathcal{V}$:

(a) There exists a self-adjoint operator B such that

$$A = B^2.$$

(b) There exists a linear operator C such that

$$A = C^*C.$$

(c) The operator A is self-adjoint and $(Ax, x) \ge 0$ for
any vector $x \in \mathcal{V}$.

(d) The operator A is self-adjoint and all of its eigenvalues
are nonnegative.

Also equivalent are the strengthened variants of these properties
resulting when we require in (a) and (b) that the operators
B and C should be invertible, in (c) that $(Ax, x) > 0$
for $x \ne 0$, and in (d) that all eigenvalues should be positive.

Proof. Implication (a) $\Rightarrow$ (b). It suffices to put C = B.

Implication (b) $\Rightarrow$ (c). If $A = C^*C$, then $(Ax, x) = (Cx, Cx) = |Cx|^2 \ge 0$.
Moreover, if the operator C is
invertible and hence $Cx \ne 0$ for $x \ne 0$, then $(Ax, x) > 0$
for $x \ne 0$.










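Implication (c) $\Rightarrow$ (d). If $Ax = \lambda x$, then $(Ax, x) = \lambda (x, x)$,
and therefore if $(Ax, x)$ is nonnegative (positive), then $\lambda$
is nonnegative (positive).

Implication (d) $\Rightarrow$ (a). Let $e_1, \ldots, e_n$ be an orthonormal basis
consisting of eigenvectors of the operator A (it exists, since A is
self-adjoint) and let $\lambda_1, \ldots, \lambda_n$
be the corresponding eigenvalues. Since under the hypothesis
$\lambda_1 \ge 0, \ldots, \lambda_n \ge 0$, there exist roots $\sqrt{\lambda_1}, \ldots, \sqrt{\lambda_n}$
(in $\mathbb{R}$). We define the operator B by the formulas

(1)    $Be_1 = \sqrt{\lambda_1}\, e_1, \quad \ldots, \quad Be_n = \sqrt{\lambda_n}\, e_n.$

It is clear that B is self-adjoint and that $B^2 = A$. □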

Definition 1. An operator A is said to be nonnegative
if it possesses properties (a) to (d). If the operator A possesses
the strengthened properties (a) to (d), it is said to be positive.
Every self-adjoint operator B satisfying the relation $B^2 = A$
is called a square root of the operator A. A nonnegative
(positive) square root is designated $\sqrt{A}$.

Formula (1) shows that there does exist an operator $\sqrt{A}$
and that it is uniquely defined for any nonnegative (positive)
operator A. □
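
Formula (1) translates directly into a computation. The sketch below is a minimal illustration, assuming a real symmetric nonnegative 2×2 matrix given in an orthonormal basis; the eigenvalues and eigenvectors are found from the closed formulas for the 2×2 case, and the function name `sqrt_sym_2x2` is hypothetical.

```python
import math

def sqrt_sym_2x2(A):
    """Nonnegative square root of a symmetric nonnegative matrix [[a, b], [b, d]]."""
    (a, b), (_, d) = A
    m, r = (a + d) / 2, math.hypot((a - d) / 2, b)
    eigs = [m + r, m - r]                       # eigenvalues, larger first
    if b == 0:
        # Already diagonal: the eigenvectors are the basis vectors.
        vecs = [(1.0, 0.0), (0.0, 1.0)] if a >= d else [(0.0, 1.0), (1.0, 0.0)]
    else:
        vecs = []
        for lam in eigs:
            vx, vy = b, lam - a                 # solves (A - lam E) v = 0
            norm = math.hypot(vx, vy)
            vecs.append((vx / norm, vy / norm))
    # B multiplies each unit eigenvector by the root of its eigenvalue, as in formula (1).
    B = [[0.0, 0.0], [0.0, 0.0]]
    for lam, (vx, vy) in zip(eigs, vecs):
        s = math.sqrt(max(lam, 0.0))            # guard against rounding below zero
        B[0][0] += s * vx * vx
        B[0][1] += s * vx * vy
        B[1][0] += s * vy * vx
        B[1][1] += s * vy * vy
    return B

A = [[2.0, 1.0], [1.0, 2.0]]                    # eigenvalues 3 and 1
B = sqrt_sym_2x2(A)
B2 = [[sum(B[i][k] * B[k][j] for k in range(2)) for j in range(2)] for i in range(2)]
print(B, B2)
```

The matrix `B2` reproduces `A` up to rounding, and `B` is symmetric with positive eigenvalues.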

It is obvious that a nonnegative operator is positive if and
only if it is invertible. □

In a Euclidean space a self-adjoint operator A is positive
if and only if the quadratic functional (Ax, x) is positive definite.

Note that in a number of textbooks and monographs non¬ 
negative operators are called positive, while positive operators 
are called strictly positive. 

Positive operators are the analogues of positive real
numbers. Now let us consider operators that are the analogues
of complex numbers whose modulus is equal to unity.

Proposition 2. The following properties of a linear operator A
are equivalent, in a Euclidean or a unitary space $\mathcal{V}$:

(a) For any two vectors $x, y \in \mathcal{V}$ we have

$$(Ax, Ay) = (x, y).$$

(b) For any vector $x \in \mathcal{V}$ we have

$$|Ax| = |x|.$$




(c) For any orthonormal basis $e_1, \ldots, e_n$ of a space $\mathcal{V}$
the vectors $Ae_1, \ldots, Ae_n$ also constitute an orthonormal basis
of that space.

(d) For the elements $a^i_j$ of the matrix of an operator A,
in an arbitrary orthonormal basis $e_1, \ldots, e_n$ of a space $\mathcal{V}$,
there are relations

(2)    $\sum_{k=1}^{n} a^k_i a^k_j = \delta_{ij}, \quad i, j = 1, \ldots, n,$

if the space $\mathcal{V}$ is Euclidean and relations

(3)    $\sum_{k=1}^{n} a^k_i \overline{a^k_j} = \delta_{ij}, \quad i, j = 1, \ldots, n,$

if the space $\mathcal{V}$ is unitary.

(e) We have

$$A^*A = E.$$

(f) The operator A is invertible and

$$A^{-1} = A^*.$$

(g) We have

$$AA^* = E.$$

(h) For the elements $a^i_j$ of the matrix of an operator A,
in an arbitrary orthonormal basis $e_1, \ldots, e_n$ of a space $\mathcal{V}$,
there are relations

(4)    $\sum_{k=1}^{n} a^i_k a^j_k = \delta^{ij}, \quad i, j = 1, \ldots, n,$

if the space $\mathcal{V}$ is Euclidean and relations

(5)    $\sum_{k=1}^{n} a^i_k \overline{a^j_k} = \delta^{ij}, \quad i, j = 1, \ldots, n,$

if the space is unitary.










Proof. We shall prove that the following implications
hold:

$$(a) \Rightarrow (b) \Rightarrow (e), \qquad (a) \Rightarrow (c) \Leftrightarrow (d) \Leftrightarrow (e) \Leftrightarrow (a), \qquad (e) \Leftrightarrow (f) \Leftrightarrow (g) \Leftrightarrow (h).$$

Implication (a) $\Rightarrow$ (b). It suffices to put y = x.

Implication (a) $\Rightarrow$ (c). Since $(Ae_i, Ae_j) = (e_i, e_j)$, we have
$(Ae_i, Ae_j) = \delta_{ij}$ if $(e_i, e_j) = \delta_{ij}$.

Implication (b) $\Rightarrow$ (e). If $|Ax| = |x|$, then
$((A^*A - E)\,x, x) = (A^*Ax, x) - (x, x) = (Ax, Ax) - (x, x) = |Ax|^2 - |x|^2 = 0$
and hence $A^*A = E$ (in a Euclidean space $\mathcal{V}$, because the operator $A^*A - E$ is
symmetric, and in a unitary space by Proposition 1
of Lecture 18).

Implications (c) $\Leftrightarrow$ (d). By definition $Ae_i = a^j_i e_j$. Therefore

$$(Ae_i, Ae_j) = \sum_{k=1}^{n} a^k_i a^k_j$$

in a Euclidean space and

$$(Ae_i, Ae_j) = \sum_{k=1}^{n} a^k_i \overline{a^k_j}$$

in a unitary space. Hence (c) $\Rightarrow$ (d) and (d) $\Rightarrow$ (c).

Implications (a) $\Leftrightarrow$ (e). By definition $(A^*Ax, y) = (Ax, Ay)$.
Therefore (a) $\Rightarrow$ (e) and (e) $\Rightarrow$ (a) (since for
an operator C we have $(Cx, y) = (x, y)$ for any vectors x and y
if and only if C = E).

Implications (d) $\Leftrightarrow$ (e) and (g) $\Leftrightarrow$ (h). The operator A*
has the matrix $(\overline{a^j_i})$ in the basis $e_1, \ldots, e_n$. Hence elements
of the matrix of the operator AA* are the sums $\sum_k a^i_k \overline{a^j_k}$ and
elements of the matrix of the operator A*A are the sums
$\sum_k \overline{a^k_i}\, a^k_j$. Therefore (d) $\Leftrightarrow$ (e) and (g) $\Leftrightarrow$ (h).

Implications (e) $\Rightarrow$ (f) and (g) $\Rightarrow$ (f). See implications
1° $\Rightarrow$ 5° and 3° $\Rightarrow$ 5° of Proposition 2 in Lecture 14.




Implications (f) $\Rightarrow$ (e) and (f) $\Rightarrow$ (g). Hold by definition. □

Definition 2. In a Euclidean or a unitary space $\mathcal{V}$ a
linear operator A is said to be isometric if it possesses properties
(a) to (h). Isometric operators are also called orthogonal
in a Euclidean space and unitary in a unitary
space $\mathcal{V}$.

Property (a) implies that an operator A preserves scalar
products (and hence, in particular, also angles), i.e. is
a homomorphism (in fact, by virtue of (f), even an isomorphism)
of a space $\mathcal{V}$ onto itself.

Note that any isometric operator is normal (A*A = AA*). □ 

As we know (Proposition 4 of Lecture 14 in [1]), real
matrices possessing properties (2) or (4) are exactly orthogonal
matrices. By analogy matrices with complex coefficients
possessing properties (3) and (5) are unitary matrices. For
these, the following analogue of Proposition 4 of Lecture 14
in [1] holds (the symbol $\bar{A}^{\top}$ designates a transposed matrix
all the elements of which have been replaced by complex
conjugate numbers).

Proposition 3. A matrix $A = (a^i_j)$ of order n, with complex
coefficients, is unitary if and only if it has one (and hence all)
of the following properties:

(a) The matrix A is a transition matrix connecting two
orthonormal bases of an n-dimensional unitary space.

(b) The columns of the matrix A constitute an orthonormal
family of vectors of a unitary space $\mathbb{C}^n$.

(c) We have

$$\bar{A}^{\top} A = E.$$

(d) The matrix A is invertible and

$$A^{-1} = \bar{A}^{\top}.$$

(e) We have

$$A \bar{A}^{\top} = E.$$

(f) The rows of the matrix A constitute an orthonormal
family of vectors of a unitary space $\mathbb{C}^n$.

Proof. Let us introduce a linear operator A that has a
matrix A in some orthonormal basis. Then properties (a)
to (f) turn into properties (c) to (h) of the operator A in
Proposition 2. □

Since $\det \bar{A}^{\top} = \overline{\det A}$, it follows from properties (c)
and (e) that

$$|\det A| = 1$$

for any unitary matrix A.

It is obvious that all unitary matrices of order n form
a group. This is called a unitary group and designated U(n).
Its subgroup consisting of unimodular (det A = 1) matrices
is designated SU(n).

Proposition 4. In a Euclidean (unitary) space any invertible
operator A is uniquely decomposed as a product of an
isometric operator U and a positive operator P:

(6)    $A = UP.$

Proof. According to Proposition 1 an operator A*A is
positive and therefore there exists a positive square root

$$P = \sqrt{A^*A}.$$

Let $U = AP^{-1}$. Then $U^* = (P^*)^{-1}A^* = P^{-1}A^*$ (for the
operator P is self-adjoint) and therefore
$U^*U = P^{-1}A^*AP^{-1} = P^{-1}P^2P^{-1} = E$. Thus A = UP, where the operator U is
isometric and the operator P is positive.

If UP = VQ, where U, V are isometric operators and P
and Q are positive operators, then PU* = QV* and therefore

$$P^2 = PU^*UP = QV^*VQ = Q^2.$$

Hence (a positive square root is extracted uniquely) P = Q
and therefore U = V. This proves that decomposition (6) is
unique. □

Decomposition (6) is usually called the polar factorization
of an operator A. It is similar to the decomposition
$re^{i\varphi} = r(\cos\varphi + i \sin\varphi)$ of an arbitrary complex number as
a product of its modulus r and a number equal in absolute
value to unity.
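
The construction used in the proof of Proposition 4 can be followed numerically. The sketch below is only an illustration for real invertible 2×2 matrices: P is the positive square root of A*A and U = AP⁻¹. The square root is computed by a closed formula valid for 2×2 positive symmetric matrices, $\sqrt{S} = (S + \sqrt{\det S}\,E)/\sqrt{\operatorname{tr} S + 2\sqrt{\det S}}$; this shortcut is an assumption of the sketch and not the method of the text, and the name `polar_2x2` is hypothetical.

```python
import math

def matmul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def transpose(X):
    return [[X[j][i] for j in range(2)] for i in range(2)]

def polar_2x2(A):
    """Return (U, P) with A = U P, U orthogonal, P symmetric positive."""
    S = matmul(transpose(A), A)                      # A*A, positive for invertible A
    s = math.sqrt(S[0][0] * S[1][1] - S[0][1] * S[1][0])
    t = math.sqrt(S[0][0] + S[1][1] + 2 * s)
    P = [[(S[0][0] + s) / t, S[0][1] / t],
         [S[1][0] / t, (S[1][1] + s) / t]]           # positive square root of S
    det_p = P[0][0] * P[1][1] - P[0][1] * P[1][0]
    P_inv = [[P[1][1] / det_p, -P[0][1] / det_p],
             [-P[1][0] / det_p, P[0][0] / det_p]]
    U = matmul(A, P_inv)                             # U = A P^{-1}
    return U, P

A = [[2.0, 1.0], [0.0, 1.0]]
U, P = polar_2x2(A)
print(U)
print(P)
```

Multiplying `U` by `P` gives back `A`, and the product of `U` with its transpose is the identity matrix up to rounding.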

Recall (see Lecture 26 of [1]) that an affine transformation
of an affine space $\mathcal{A}$ is its arbitrary automorphism, i.e.
a transformation defined by equating coordinates in two
affine coordinate systems. If in the space $\mathcal{A}$ an initial point
O is chosen, then an arbitrary affine transformation carries
a point with a radius vector x over into a point with a radius
vector of the form

(7)    $y = Ax + b,$

where A is some invertible linear operator and b is a fixed
vector (this is but a different way of writing formula (2)
of Lecture 27 in [1]).

Similarly, an orthogonal transformation of a Euclidean
point space is its transformation defined by equating
coordinates in two Euclidean (rectangular) coordinate
systems. It can be written using the same formula (7) but
now with an orthogonal operator A.

By analogy we can introduce unitary point spaces as
affine spaces into whose associated vector space the structure
of a unitary vector space is introduced. Automorphisms
of such spaces are unitary transformations that can be written
using formula (7) with a unitary operator A.

Since any Euclidean (or unitary) point space is, in partic¬ 
ular, affine, it makes sense to speak of its affine transforma¬ 
tions (7). To a polar factorization A = UP of an operator A 
there corresponds then a representation of an affine transfor¬ 
mation (7) as a composition of an affine transformation 

(8) y = Px 

and an orthogonal (or unitary) transformation 

y = Ux + b. 

In appropriately chosen rectangular coordinates transformation (8)
can be written as

$$y_1 = h_1 x_1, \quad \ldots, \quad y_n = h_n x_n,$$

where $h_1 > 0, \ldots, h_n > 0$, and hence is a composition of n
compressions toward n mutually perpendicular hyperplanes.
This proves that any affine transformation of an n-dimensional
Euclidean (unitary) point space is a composition of an
orthogonal (unitary) transformation and n compressions toward
n mutually perpendicular hyperplanes. □

For n = 2 this statement is the content of Proposition 1
in Lecture 27 of [1].

For A = E transformation (7) has the form 

y = X + b 

and is called a (parallel) translation to the vector b. For 
b = 0 transformation (7) has the form 

y = Ax 

and is called a centroaffine transformation. It leaves fixed 
a point 0 called its centre. Any affine transformation is 
a composition of a translation and a centroaffine transfor¬ 
mation. 

We stress that transformation (7), with $b \ne 0$, may well
be a centroaffine transformation (with centre other than O).
For this to happen, it is necessary and sufficient that there
should exist a vector $x_0$ (the radius vector of a centre) satisfying
the relation

$$x_0 = Ax_0 + b,$$

i.e. such that $(E - A)\,x_0 = b$. In particular, this is necessarily
so if the operator $A - E$ is invertible, i.e. if the number
1 is not an eigenvalue of the operator A.
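
The condition for a centre amounts to solving a linear system. The sketch below is an illustration for the plane only: it solves $(E - A)\,x_0 = b$ by Cramer's rule and returns `None` when 1 is an eigenvalue of A, in which case the system has no unique solution. The function name `centre_2x2` is hypothetical.

```python
def centre_2x2(A, b):
    """Solve x0 = A x0 + b, i.e. (E - A) x0 = b, in the plane.

    Returns None when E - A is not invertible (1 is an eigenvalue of A).
    """
    m11, m12 = 1 - A[0][0], -A[0][1]
    m21, m22 = -A[1][0], 1 - A[1][1]
    det = m11 * m22 - m12 * m21
    if det == 0:
        return None
    # Cramer's rule for the 2x2 system.
    return ((b[0] * m22 - m12 * b[1]) / det,
            (m11 * b[1] - m21 * b[0]) / det)

rotation = [[0.0, -1.0], [1.0, 0.0]]     # rotation by a right angle, no eigenvalue 1
b = (1.0, 0.0)
x0 = centre_2x2(rotation, b)
image = (rotation[0][0] * x0[0] + rotation[0][1] * x0[1] + b[0],
         rotation[1][0] * x0[0] + rotation[1][1] * x0[1] + b[1])
print(x0, image)                                         # the centre is carried into itself
print(centre_2x2([[1.0, 0.0], [0.0, 1.0]], b))           # a translation has no unique centre
```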

An orthogonal transformation that is a centroaffine one is 
called a generalized rotation. It is called simply a rotation 
if the orthogonal operator A is unimodular (orientation¬ 
preserving). 

To get at least a primary idea of rotations we must study
orthogonal operators in greater detail. To this end it would
be convenient first to consider unitary operators.

Proposition 5. The spectrum of an arbitrary unitary operator A
lies, in the plane of a complex variable, on a unit circle,
i.e. the absolute value of any characteristic root $\lambda$ of a unitary
operator is equal to unity:

$$|\lambda| = 1.$$

Proof. Any characteristic root $\lambda$ is an eigenvalue over
the field $\mathbb{C}$, i.e. there exists a vector $x_0 \ne 0$ such that
$Ax_0 = \lambda x_0$. Then

$$(x_0, x_0) = (Ax_0, Ax_0) = (\lambda x_0, \lambda x_0) = \lambda\bar\lambda\,(x_0, x_0)$$

and hence $\lambda\bar\lambda = 1$. □

Theorem 1. For any unitary operator A there exists an
orthonormal basis in which the matrix of the operator A is of
the form

$$\begin{pmatrix} e^{i\varphi_1} & & 0 \\ & \ddots & \\ 0 & & e^{i\varphi_n} \end{pmatrix}.$$

Proof. A unitary operator is normal and hence orthogonally
diagonalizable. This, together with Proposition 5, proves
the theorem. □

Now let A be an orthogonal operator in a (real) Euclidean
space $\mathcal{V}$.

We define its complexification

$$A^{\mathbb{C}}(x + iy) = Ax + iAy,$$

which is (see Lecture 17) a linear operator on the complexification

$$\mathcal{V}^{\mathbb{C}} = \mathcal{V} + i\mathcal{V}$$

of the space $\mathcal{V}$.

For any vectors

$$z = x + iy \in \mathcal{V}^{\mathbb{C}}, \qquad z_1 = x_1 + iy_1 \in \mathcal{V}^{\mathbb{C}}$$

we set

$$(z, z_1) = [(x, x_1) + (y, y_1)] - i\,[(x, y_1) - (x_1, y)].$$

A routine check shows that the functional $z, z_1 \mapsto (z, z_1)$
is sesquilinear, Hermitian and positive definite, i.e. may
be taken as a scalar multiplication in the complex vector
space $\mathcal{V}^{\mathbb{C}}$. Under this multiplication the space $\mathcal{V}^{\mathbb{C}}$ is thus
a unitary space.









Further, since

$$(A^{\mathbb{C}}z, A^{\mathbb{C}}z_1) = [(Ax, Ax_1) + (Ay, Ay_1)] - i\,[(Ax, Ay_1) - (Ax_1, Ay)] = [(x, x_1) + (y, y_1)] - i\,[(x, y_1) - (x_1, y)] = (z, z_1),$$

the complexification $A^{\mathbb{C}}$ of the orthogonal operator A is a
unitary operator. Therefore, in particular, the operator $A^{\mathbb{C}}$
is diagonalizable.

It follows (see Theorem 1 of Lecture 17) that in the space $\mathcal{V}$
there exists a basis in which the matrix of the operator A
is a direct sum of first order matrices of the form $(\lambda)$ and second
order matrices of the form

$$\begin{pmatrix} \alpha & \beta \\ -\beta & \alpha \end{pmatrix}.$$

The real numbers $\lambda$ are characteristic roots of the operator
$A^{\mathbb{C}}$ and therefore $|\lambda| = 1$, i.e. $\lambda = \pm 1$. As to the numbers
$\alpha$, $\beta$, they are the real part and the coefficient of the imaginary
part of the nonreal characteristic root $\lambda = e^{i\varphi}$ of the
operator $A^{\mathbb{C}}$ and hence $\alpha = \cos\varphi$ and $\beta = \sin\varphi$, where
$-\pi < \varphi < \pi$ and $\varphi \ne 0$.

Since the matrices

$$\begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}, \qquad \begin{pmatrix} -1 & 0 \\ 0 & -1 \end{pmatrix}$$

are also of the form

(9)    $\begin{pmatrix} \cos\varphi & \sin\varphi \\ -\sin\varphi & \cos\varphi \end{pmatrix}$

(for $\varphi = 0$ and $\varphi = \pi$ respectively), it follows that in some
basis $e_1, \ldots, e_n$ of the space $\mathcal{V}$ the matrix of an orthogonal
operator is a direct sum of m matrices of the form (9) (with
$-\pi < \varphi \le \pi$) and one first order matrix $(\pm 1)$ in the case
n = 2m + 1, and either a direct sum of m matrices of the
form (9) or a direct sum of m − 1 of such matrices and a
matrix of the form

$$\begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}$$

in the case n = 2m. □




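According to the construction described in Lecture 17,
a basis $e_1, \ldots, e_n$ of the space $\mathcal{V}$ is obtained from some
basis $e_1^{\mathbb{C}}, \ldots, e_n^{\mathbb{C}}$ of the space $\mathcal{V}^{\mathbb{C}}$ having the following
two properties:

(a) every vector $e_q^{\mathbb{C}}$ is an eigenvector of the unitary operator $A^{\mathbb{C}}$;

(b) if an eigenvalue $\lambda_q = e^{i\varphi_q}$ to which the eigenvector $e_q^{\mathbb{C}}$
belongs is real (i.e. $\varphi_q = 0, \pi$), then so is the vector $e_q^{\mathbb{C}}$,
and if $0 < \varphi_q < \pi$, then the vector $e_q^{\mathbb{C}}$ is complex conjugate
to the vector $e_{q+1}^{\mathbb{C}}$ belonging to the complex conjugate
eigenvalue $\bar\lambda_q = e^{-i\varphi_q}$.

Also

$$e_q^{\mathbb{C}} = \begin{cases} e_q & \text{if } \varphi_q = 0, \pi, \\ e_q + i e_{q+1} & \text{if } 0 < \varphi_q < \pi, \\ e_{q-1} - i e_q & \text{if } -\pi < \varphi_q < 0. \end{cases}$$

Moreover, in addition to properties (a) and (b) we may
assume the basis $e_1^{\mathbb{C}}, \ldots, e_n^{\mathbb{C}}$ to be orthonormal (since the
operator $A^{\mathbb{C}}$ is diagonalizable orthogonally). Since

$$e_q = \begin{cases} e_q^{\mathbb{C}} & \text{when } \varphi_q = 0, \pi, \\ \dfrac{e_q^{\mathbb{C}} + e_{q+1}^{\mathbb{C}}}{2} & \text{when } 0 < \varphi_q < \pi, \\ \dfrac{e_{q-1}^{\mathbb{C}} - e_q^{\mathbb{C}}}{2i} & \text{when } -\pi < \varphi_q < 0, \end{cases}$$

the following equations hold

$$(e_p, e_q) = 0, \quad p \ne q,$$

$$(e_p, e_p) = \begin{cases} 1 & \text{if } \varphi_p = 0, \pi, \\ \tfrac{1}{2} & \text{if } \varphi_p \ne 0, \pi. \end{cases}$$

Consequently, if all vectors $e_p$ with $\varphi_p \ne 0, \pi$ are multiplied
by $\sqrt{2}$, an orthonormal basis results. Since, as is easily
seen, the matrix of the operator A remains unaltered under
the operation, we have proved the following theorem.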

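Theorem 2. For any orthogonal operator A in an n-dimensional
Euclidean space $\mathcal{V}$ there exists an orthonormal basis
in which its matrix, for n = 2m + 1, has the form

(10)    $\begin{pmatrix} \cos\varphi_1 & \sin\varphi_1 & & & & \\ -\sin\varphi_1 & \cos\varphi_1 & & & & \\ & & \ddots & & & \\ & & & \cos\varphi_m & \sin\varphi_m & \\ & & & -\sin\varphi_m & \cos\varphi_m & \\ & & & & & \varepsilon \end{pmatrix},$

where $\varepsilon = \pm 1$ and, for n = 2m, either the form

(11)    $\begin{pmatrix} \cos\varphi_1 & \sin\varphi_1 & & & \\ -\sin\varphi_1 & \cos\varphi_1 & & & \\ & & \ddots & & \\ & & & \cos\varphi_m & \sin\varphi_m \\ & & & -\sin\varphi_m & \cos\varphi_m \end{pmatrix}$

or the form

(12)    $\begin{pmatrix} \cos\varphi_1 & \sin\varphi_1 & & & & & \\ -\sin\varphi_1 & \cos\varphi_1 & & & & & \\ & & \ddots & & & & \\ & & & \cos\varphi_{m-1} & \sin\varphi_{m-1} & & \\ & & & -\sin\varphi_{m-1} & \cos\varphi_{m-1} & & \\ & & & & & 1 & 0 \\ & & & & & 0 & -1 \end{pmatrix}$

(all the elements not written out are zeros). □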





























Note that the determinant of matrix (10) equals e, the 
determinant of matrix (11) is positive (equals 1) and the 
determinant of matrix (12) is negative (equals —1). 

In terms of orthogonal transformations of point spaces
Theorem 2 means that any rotation of an n-dimensional
Euclidean space is a composition of rotations in $m \le n/2$
mutually perpendicular two-dimensional planes and that any
generalized orientation-reversing rotation is a composition
of some rotation possessing an axis (i.e. a straight line all
points of which remain fixed) and a reflection in a hyperplane
perpendicular to that axis. For n = 2m + 1 any rotation
possesses an axis, whereas for n = 2m there exist rotations
without axes (these are rotations (11) for which $\varphi_p \ne 0, \pi$,
with any $p = 1, \ldots, m$).

Since a rotation without axes (more precisely, the cor¬ 
responding orthogonal operator in an associated vector 
space) has no eigenvalues equal to 1, its composition with 
any translation is again a rotation but with a different 
centre. A similar statement for rotations possessing axes is 
true if and only if the translation vector is parallel to none 
of the (many possible) axes of rotation. It follows that any 
motion of a Euclidean space is a screw motion, i.e. a composi¬ 
tion of a rotation and a translation to a vector parallel 
to some rotation axis. 







Lecture 22 


Smooth functions • Smooth hypersurfaces • Gradient • Derivatives
with respect to a vector • Vector fields • Singular
points of a vector field • A module of vector fields • Potential 
and irrotational vector fields • The rotation of a vector field • 
The divergence of a vector field • Vector analysis • Hamilton's 
symbolic vector • Formulas for products • Compositions of 
operators 


The space $\mathbb{R}^n$ of row vectors is not only a numerical model
of n-dimensional affine or Euclidean spaces but also the
domain of functions $F(x_1, \ldots, x_n)$ of n variables. Here
geometry is closely interwoven with mathematical analysis
(function theory) and becomes practically indistinguishable
from it. It is no wonder therefore that one of the earliest,
and at the same time one of the most important, rigorous
definitions, or what is said to be explications, of the intuitive
notion of a curve in the plane, of a surface in three-dimensional
space and, in general, of a hypersurface in an n-dimensional
space was given in analysis.

That definition proceeds from viewing a hypersurface
(for n = 2, a curve) as a “locus” of points whose coordinates
satisfy a condition of the form

(1)    $F(x_1, \ldots, x_n) = 0.$

Since we want to explicate the notion of a “smooth” curve
or a surface having no fractures, it is natural to assume the
function F to be a differentiable function of class $C^\infty$, i.e.
a function having (automatically continuous) partial derivatives
of all orders. It is usual, however, to use in practice
(in proving theorems) mostly derivatives of the first and
second orders and only seldom those of higher orders. Therefore,
in order not to violate the general-mathematical
principle of not introducing unessential assumptions, we
assume the function F to have continuous partial derivatives
only up to some order $k \ge 1$ inclusively. Moreover, in
order to get rid of the irksome need to see to it that nowhere
derivatives of higher orders should be used we shall not
specify the order k, i.e. we shall simply require that all
functions should have continuous partial derivatives of all
the orders we shall need. For brevity we shall call such
functions smooth functions.

The smoothness condition is of a local character and may 
fail at isolated points. To take this into account we shall 
consider equations of the form (1) not in the whole of $\mathbb{R}^n$
but in some open set $U \subset \mathbb{R}^n$ (for example, in an open ball).
The set of all functions $x \mapsto F(x)$ defined and smooth at
all points $x = (x_1, \ldots, x_n) \in U$ will be designated

It is obviously a ring and an (infinite-dimensional) vector 
space over the field R. 

For the simplest smooth functions (for example, poly¬ 
nomials) the sets given by condition (1) correspond quite 
well as a rule with the intuitive notion of surfaces, although 
often not in the entire space but only in some open set 
of the space. Therefore the opinion prevailed for a long 
time that the sets given by conditions of the form (1) with 
a smooth function F are more or less capable of pretending 
to the role of hypersurfaces (of curves, for n = 2). And it
came as all the more of a surprise when about forty years
ago the American mathematician Whitney proved the
theorem which states that for any closed set $C \subset \mathbb{R}^n$ there
exists a smooth (class $C^\infty$) function F in $\mathbb{R}^n$ such that $F(x) = 0$
if and only if $x \in C$. (It is easy to see that for the function F
to exist it is necessary that the set C should be closed; it is
a surprise that it is also sufficient that the set should be
closed.) We shall prove the theorem in the third semester's
lectures, and now we shall only give an example.

Example. The function F given by the formula 






where $|x| = \sqrt{x_1^2 + \ldots + x_n^2}$, belongs to the class $C^\infty$
in the whole of $\mathbb{R}^n$. Moreover, the set of all points $x \in \mathbb{R}^n$
for which $F(x) = 0$ is a ball (or a disk for n = 2) $|x| \le 1$.
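
A classical function with exactly these properties, assumed here purely for illustration and not necessarily the one meant in the example, is equal to $\exp\bigl(-1/(|x|^2 - 1)\bigr)$ for $|x| > 1$ and to zero for $|x| \le 1$. The sketch below evaluates it; the name `F` and the sample points are hypothetical.

```python
import math

def F(x):
    """Assumed example: exp(-1/(|x|^2 - 1)) outside the unit ball, 0 on the ball."""
    r2 = sum(c * c for c in x)
    if r2 <= 1.0:
        return 0.0
    return math.exp(-1.0 / (r2 - 1.0))

inside = F((0.5, 0.5))        # |x| < 1
on_sphere = F((1.0, 0.0))     # |x| = 1
outside = F((2.0, 0.0))       # |x| = 2, value exp(-1/3)
print(inside, on_sphere, outside)
```

In floating-point arithmetic the values just outside the unit sphere underflow to zero, so the zero set can be read off reliably only at some distance from the sphere; this reflects the fact that all derivatives of such a function vanish on the sphere.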

Whitney's theorem explains why the condition of smoothness
of the function F has to be supplemented with additional
conditions. The regularity condition known from the
course in analysis is that at any point of hypersurface (1)
the vector

$$\operatorname{grad} F = \left( \frac{\partial F}{\partial x_1}, \ldots, \frac{\partial F}{\partial x_n} \right)$$

(the so-called gradient of the function F) should be nonzero,
i.e. that at least one partial derivative

(2)    $\dfrac{\partial F}{\partial x_1}, \ \ldots, \ \dfrac{\partial F}{\partial x_n}$

should be nonzero. Thus we arrive at the following definition.

Definition 1. A set $\mathcal{S}$ of all points $x = (x_1, \ldots, x_n)$
of an open set $U \subset \mathbb{R}^n$ that satisfy the equation

(3)    $F(x) = 0,$

where F is a function smooth in U, is said to be a smooth
(or regular) hypersurface in U if at every point $x \in \mathcal{S}$ at
least one partial derivative (2) is nonzero.



Points in the space $\mathbb{R}^{n-1}$ will be designated by symbols
of the form $\bar{x}, \bar{y}, \ldots$ and so on. And for any point
$x = (x_1, \ldots, x_{n-1}, x_n) \in \mathbb{R}^n$ the symbol $\bar{x}$ will designate
a point $(x_1, \ldots, x_{n-1}) \in \mathbb{R}^{n-1}$. Accordingly for any set
$C \subset \mathbb{R}^n$ the symbol $\bar{C}$ will designate the set of all points
$\bar{x} \in \mathbb{R}^{n-1}$, where $x \in C$. Instead of $x = (x_1, \ldots, x_{n-1}, x_n)$
we shall also write $x = (\bar{x}, x_n)$.

Recall that a graph of a smooth function $x_n = \varphi(\bar{x})$ given
in an open set $V \subset \mathbb{R}^{n-1}$ is a set of all points of the form
$(\bar{x}, \varphi(\bar{x})) \in \mathbb{R}^n$. It is clear that any graph is a smooth hypersurface
for which $U = V \times \mathbb{R} \subset \mathbb{R}^n$ and $F(x) = \varphi(\bar{x}) - x_n$,
since $\dfrac{\partial F}{\partial x_n}(x) = -1$ for any point $x \in U$. □

The converse is certainly false. For example, the circle
$x^2 + y^2 = 1$ in the plane is not the graph of any function.

Nevertheless it will be a graph in the neighbourhood
of each of its points (the graph of the function
$y = \sqrt{1 - x^2}$ in the neighbourhood of say the point (0, 1),
the graph of the function $y = -\sqrt{1 - x^2}$ in the neighbourhood
of the point (0, −1), and the graph of the function
$x = \sqrt{1 - y^2}$ in the neighbourhood of the point
(1, 0); in the last case the role of the coordinate $x_n$ is
played not by the coordinate y but by the coordinate x).
It turns out that a similar statement is true for every
hypersurface $\mathcal{S}$, i.e. up to an interchange of coordinates
any hypersurface (3) is the graph of some smooth function in
the neighbourhood of each of its points. This statement constitutes
the geometrical content of the following theorem
known from analysis.

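Implicit function theorem. Let $U \subset \mathbb{R}^n$ be an open set,
$x_0 = (x_1^0, \ldots, x_n^0) \in U$ be some point in it, and $F\colon U \to \mathbb{R}$
be a smooth function in U such that

$$F(x_0) = 0 \quad \text{and} \quad \frac{\partial F}{\partial x_n}(x_0) \ne 0.$$

Then in the space $\mathbb{R}^n$ there exists a neighbourhood $U_0 \subset U$
of the point $x_0$ and a function $x_n = \varphi(\bar{x})$ defined and smooth
in the neighbourhood $\bar{U}_0 \subset \mathbb{R}^{n-1}$ of a point $\bar{x}_0 = (x_1^0, \ldots, x_{n-1}^0)$
such that

(a) $\varphi(\bar{x}_0) = x_n^0$;

(b) $(\bar{x}, \varphi(\bar{x})) \in U_0$ for any point $\bar{x} \in \bar{U}_0$;

(c) if $x = (\bar{x}, x_n) \in U_0$, then $F(x) = 0$ if and only if
$x_n = \varphi(\bar{x})$. □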



Since the graphs of smooth functions of one and two 
variables seem to fully correspond with the intuitive notion 
of smooth curves and surfaces, the implicit function theorem 
shows that the explication given by Definition 1 of the 
concept of a hypersurface at any rate is not at variance 
with intuition. Moreover, the class of smooth hypersurfaces 
is wide enough to be distinguished. 
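
For the circle the function provided by the implicit function theorem can be written down explicitly, which makes its properties easy to check. The sketch below is an illustration under the following assumptions: $F(x, y) = x^2 + y^2 - 1$, the point is (0, 1), and the neighbourhood is represented by a handful of sample abscissas. The names `F`, `dF_dy` and `phi` are hypothetical.

```python
import math

def F(x, y):
    return x * x + y * y - 1.0

def dF_dy(x, y):
    return 2.0 * y

def phi(x):
    """Branch of the circle through the point (0, 1)."""
    return math.sqrt(1.0 - x * x)

xs = [-0.5 + 0.1 * i for i in range(11)]          # sample points near x = 0
residuals = [abs(F(x, phi(x))) for x in xs]       # property (b): the graph lies on the circle

print(dF_dy(0.0, 1.0))     # nonzero, so the theorem applies at (0, 1)
print(dF_dy(1.0, 0.0))     # zero: at (1, 0) one has to solve for x instead of y
print(max(residuals))
```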

The restriction to the space $\mathbb{R}^n$ is of course unessential
here: the coordinate isomorphism $\mathcal{A} \to \mathbb{R}^n$ transfers the
concept of a smooth hypersurface to an arbitrary n-dimensional
affine (or Euclidean) space $\mathcal{A}$. It is clear that the
requirement of correctness (of the independence from the
choice of coordinate isomorphism) is met here.

The situation is different with the concept of a gradient.
For its definition (transferred to a space $\mathcal{A}$) to be correct,
it is necessary (and sufficient) that under any change of
the coordinates

(4)    $x_1 = c_{11} y_1 + \ldots + c_{n1} y_n, \quad \ldots, \quad x_n = c_{1n} y_1 + \ldots + c_{nn} y_n$

partial derivatives (2) should transform by the vector law
(as vector components). It is easy to see, however, that
this is not so.

Indeed, under transformation (4) the function $F(x_1, \ldots, x_n)$
goes into the function

$$\Phi(y_1, \ldots, y_n) = F(c_{11} y_1 + \ldots + c_{n1} y_n, \ \ldots, \ c_{1n} y_1 + \ldots + c_{nn} y_n)$$

and, according to the rule for differentiating a composite function
(the chain rule),

$$\frac{\partial \Phi}{\partial y_i} = \sum_{j=1}^{n} c_{ij} \frac{\partial F}{\partial x_j}.$$

This formula implies (see Lecture 4) that when coordinates
(4) are changed the partial derivatives (2) transform as
covector coordinates. Thus, from this point of view, we
must consider a gradient grad F to be a covector.
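
The transformation formula for the partial derivatives can be checked by finite differences. The sketch below is an illustration with arbitrarily chosen data: the function F, the matrix C of the change of coordinates $x_j = \sum_i c_{ij} y_i$ and the point y are all assumptions, and every name in the code is hypothetical.

```python
def F(x):
    return x[0] ** 2 + 3.0 * x[1]

def grad_F(x):
    return [2.0 * x[0], 3.0]

C = [[2.0, 1.0],
     [0.0, 1.0]]                 # C[i][j] = c_ij, so that x_j = sum_i c_ij * y_i

def x_of_y(y):
    return [sum(C[i][j] * y[i] for i in range(2)) for j in range(2)]

def Phi(y):
    return F(x_of_y(y))

def numeric_grad(f, p, h=1e-6):
    """Central finite differences for the partial derivatives of f at p."""
    g = []
    for i in range(len(p)):
        up, dn = list(p), list(p)
        up[i] += h
        dn[i] -= h
        g.append((f(up) - f(dn)) / (2 * h))
    return g

y = [0.7, -1.2]
lhs = numeric_grad(Phi, y)                                          # dPhi/dy_i
gx = grad_F(x_of_y(y))
rhs = [sum(C[i][j] * gx[j] for j in range(2)) for i in range(2)]    # sum_j c_ij dF/dx_j
print(lhs, rhs)
```

The two lists agree up to the error of the finite differences: the partial derivatives are transformed by the same matrix as the basis vectors, which is the covector law.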

But in analysis the space $\mathbb{R}^n$ is tacitly assumed to be
Euclidean, with a standard scalar multiplication
$(x, y) = x_1 y_1 + \ldots + x_n y_n$, and hence covectors are identified
with vectors. One should not forget, however, that “in fact” 
a gradient is a covector, since this may (and does actually) 
lead to errors. 

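Partial derivatives are a special case of what are called
derivatives with respect to a vector, defined for any vector
$k \in \mathbb{R}^n$ by the formula

$$\frac{\partial F}{\partial k}(x) = \lim_{t \to 0} \frac{F(x + tk) - F(x)}{t},$$

i.e. by

$$\frac{\partial F}{\partial k}(x) = \left. \frac{dF(x + tk)}{dt} \right|_{t=0}.$$

If $k = (k_1, \ldots, k_n)$, then, according to the rule for differentiating
a composite function (the chain rule),

$$\frac{\partial F}{\partial k} = k_1 \frac{\partial F}{\partial x_1} + \ldots + k_n \frac{\partial F}{\partial x_n},$$

i.e.

(5)    $\dfrac{\partial F}{\partial k} = (k, \operatorname{grad} F).$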


It is usual, however, to consider only the case where
$|k| = 1$, i.e. where the vector k is a unit vector. In that
case the derivative $\dfrac{\partial F}{\partial k}$ is also called a derivative of the
function F with respect to the direction of the vector k. In this
terminology partial derivatives are none other but derivatives
with respect to the direction of coordinate axes.

According to formula (5) the number $\dfrac{\partial F}{\partial k}$ attains maximum
(with $|k| = 1$) when the vector k is the unit vector in the direction of grad F.
The vector grad F is therefore said to have the direction of the
swiftest growth of the function F.

Note that formula (5), although involving scalar multiplication,
does not in fact assume any Euclidean property.
Indeed, its right-hand side is obviously nothing but the
value of the gradient grad F, regarded as a covector, on the
vector k. As to the derivative $\dfrac{\partial F}{\partial \mathbf{k}}$, its definition does not
assume any Euclidean property at all.

Of course the vector grad F in general changes from point 
to point, i.e. is a vector-valued function in U. Such func¬ 
tions are called “vector fields”. We shall give a general 
definition of them. 

Definition 2. Every family X consisting of n functions

$X_i\colon \mathbf{x} \mapsto X_i(\mathbf{x}), \quad i = 1, \ldots, n,$

where $\mathbf{x} = (x_1, \ldots, x_n) \in U$, is called a vector field in U.
A vector field is said to be smooth if all the functions $X_i$ are
smooth.

Formally a smooth vector field in U is nothing but a smooth
mapping $U \to \mathbb{R}^n$.



We have defined a vector field “in an analytical spirit”,
i.e. in the space $\mathbb{R}^n$ with fixed coordinates $x_1, \ldots, x_n$.
In a similar definition for an arbitrary affine (or Euclidean)
space one should require that at every point the values
of the functions $X_i$ should transform by the vector law when
coordinates are changed. We shall not consider such vector
fields in arbitrary spaces, however, since they possess a conceptual defect
(as yet hidden from us) and their “proper” definition (with
which we shall deal in the lectures of the third semester)
is in fact somewhat different.

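Yet we venture to write for clearness

(6) $X = X_1 \mathbf{e}_1 + \ldots + X_n \mathbf{e}_n,$

meaning by $\mathbf{e}_1, \ldots, \mathbf{e}_n$ the standard basis
$(1, 0, \ldots, 0), \ldots, (0, \ldots, 0, 1)$ of the space $\mathbb{R}^n$.

In particular, in that notation

$\operatorname{grad} F = \dfrac{\partial F}{\partial x_1} \mathbf{e}_1 + \ldots + \dfrac{\partial F}{\partial x_n} \mathbf{e}_n.$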

Definition 3. A point $\mathbf{x}_0 \in U$ is said to be a singular point
of a (smooth) vector field X if $X_i(\mathbf{x}_0) = 0$ for any
$i = 1, \ldots, n$, i.e. if $X(\mathbf{x}_0) = 0$.

We stress that the field remains smooth at a singular
point.

Thus we can say that a set of points $\mathbf{x} \in U$ for which
$F(\mathbf{x}) = 0$, where F is some smooth function, is a smooth hypersurface
if it does not contain any singular point of the field grad F. □

This set, however, is said to be a hypersurface also when 
it does contain singular points, provided there are “not too 
many” of them (otherwise, by virtue of Whitney’s theorem, 
an arbitrary closed set may result). It is usual to assume 
that those singular points (called, incidentally, singular 
points of the hypersurface F = 0) are isolated or, at worst, 
fill one or several “surfaces of lower dimension”. 

Example. The gradient of a quadratic form

$F(\mathbf{x}) = \lambda_1 x_1^2 + \ldots + \lambda_n x_n^2, \quad \lambda_1 \neq 0, \ldots, \lambda_n \neq 0,$

is expressed by the formula

$\operatorname{grad} F = (2\lambda_1 x_1, \ldots, 2\lambda_n x_n)$

and has a singular point only at the zero $(0, \ldots, 0)$.
Therefore a nonsingular second degree hypersurface

$\lambda_1 x_1^2 + \ldots + \lambda_n x_n^2 = 1$

(an ellipsoid or a hyperboloid) has no singular points, i.e.
is a smooth hypersurface in the sense of Definition 1.

On the contrary, a second degree cone

$\lambda_1 x_1^2 + \ldots + \lambda_n x_n^2 = 0$

has a unique singular point, the vertex $(0, \ldots, 0)$.

A cylinder over a cone

$\lambda_1 x_1^2 + \ldots + \lambda_r x_r^2 = 0, \quad r < n,$

has an $(n - r)$-dimensional plane $x_1 = 0, \ldots, x_r = 0$ of
singular points.

Vector fields can be added:

$(X + Y)_i(\mathbf{x}) = X_i(\mathbf{x}) + Y_i(\mathbf{x}), \quad i = 1, \ldots, n,$

and multiplied by functions:

$(fX)_i(\mathbf{x}) = f(\mathbf{x}) X_i(\mathbf{x}), \quad i = 1, \ldots, n.$

An automatic check shows that under these operations the
set $\mathcal{X}(U)$ of all smooth vector fields in U is a module over
the ring $\mathcal{F}(U)$. □

It is appropriate to give one general-algebraic definition 
here. 

Let A be an arbitrary ring and $\mathcal{M}$ some module over the
ring A. A family $m_1, \ldots, m_n$ of elements of the module $\mathcal{M}$
is said to be its basis if for any element $m \in \mathcal{M}$ there exist
uniquely determined elements $\lambda_1, \ldots, \lambda_n \in A$ such that

$m = \lambda_1 m_1 + \ldots + \lambda_n m_n.$

Unlike vector spaces (modules over a field), not every module
over a ring A has a basis. Modules for which there is a basis
are called free. If all the bases of a free module $\mathcal{M}$ consist
of the same number n of elements, the module is said to
possess a rank and that rank to be equal to n. In general,
there are rings over which there are free modules possessing
no rank, but such rings are necessarily noncommutative
(try to prove it!). Therefore, in particular, over the ring $\mathcal{F}(U)$
any free module possesses a rank. □

In formula (6) every vector $\mathbf{e}_i$ may be interpreted as a
vector field all of whose components are identically zero,
except the ith component which is identically equal to
unity. Then the formula will imply that the fields $\mathbf{e}_1, \ldots, \mathbf{e}_n$
constitute a basis of the module $\mathcal{X}(U)$. This proves
that for any open set $U \subset \mathbb{R}^n$ the module $\mathcal{X}(U)$ of vector
fields in U is a free module of rank n over the ring $\mathcal{F}(U)$. □

Moreover, the module $\mathcal{X}(U)$ is obviously, just as the
ring $\mathcal{F}(U)$, a vector space over the field $\mathbb{R}$ (of infinite
dimension).

The mapping $F \mapsto \operatorname{grad} F$ of the ring $\mathcal{F}(U)$ into the
module $\mathcal{X}(U)$ carries, as can easily be seen, a sum
into a sum and a product by a number into a product by
a number, i.e. is a linear mapping (a homomorphism) of the
vector space $\mathcal{F}(U)$ into the vector space $\mathcal{X}(U)$. It acts
on the product of functions, as follows directly from the
formula for differentiating a product, by the formula

(7) $\operatorname{grad}(FG) = F \operatorname{grad} G + G \operatorname{grad} F.$

It is obvious that the kernel of the linear mapping

$\operatorname{grad}\colon F \mapsto \operatorname{grad} F$

consists of locally constant functions, i.e. functions constant
on each connected component of the set U.

The image of the mapping grad does not in general coincide
with $\mathcal{X}(U)$.

Definition 4. A vector field of the form grad F is called
a gradient, or potential, field. If X = grad F, then the
function F is said to be a potential of the field X. The potential
(if there is one) is uniquely determined up to a locally
constant function.

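The vector field (6) is said to be irrotational if

(8) $\dfrac{\partial X_i}{\partial x_j} = \dfrac{\partial X_j}{\partial x_i}$

for any $i, j = 1, \ldots, n$ everywhere in U. It is easy to
see that every potential field is irrotational. Indeed, if
$X_i = \dfrac{\partial F}{\partial x_i}$, then

$\dfrac{\partial X_i}{\partial x_j} = \dfrac{\partial^2 F}{\partial x_i \partial x_j}$ and $\dfrac{\partial X_j}{\partial x_i} = \dfrac{\partial^2 F}{\partial x_j \partial x_i},$

but, according to the familiar property of mixed partial derivatives,

$\dfrac{\partial^2 F}{\partial x_i \partial x_j} = \dfrac{\partial^2 F}{\partial x_j \partial x_i}.$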

In analysis, instead of the vector field (6) one often prefers
to consider the differential expression $X_1\,dx_1 + \ldots + X_n\,dx_n$,
and then conditions (8) are necessary for
that expression to be the total differential dF of some function F.
We shall return to this matter in the third semester’s
lectures.

Generally speaking, the necessary conditions (8) are
not sufficient, i.e. not every irrotational field is potential. Every
irrotational field is potential only for the simplest domains U, such as the interior
of a ball or cube. But for arbitrary domains $U \subset \mathbb{R}^n$ the
dimension of the factor space

(vector space of irrotational fields)/(vector space of potential fields)

may serve as a measure of their complexity. This remark
will also be expounded in the third semester’s lectures.


For n = 3 irrotational fields can be described in a more
convenient manner. From here (and to the end of the lecture)
we assume that n = 3. By tradition vector fields in $\mathbb{R}^3$
will be designated $\mathbf{u}, \mathbf{v}, \ldots$ and so on. The components of
a field $\mathbf{u}$ will be designated (also by tradition) P, Q, R,
the coordinates $x_1, x_2, x_3$ in $\mathbb{R}^3$ as x, y, z, the coordinate
unit vectors $\mathbf{e}_1, \mathbf{e}_2, \mathbf{e}_3$ as $\mathbf{i}, \mathbf{j}, \mathbf{k}$ and the vector
$x\mathbf{i} + y\mathbf{j} + z\mathbf{k}$ as $\mathbf{r}$. As before U will designate an open set in $\mathbb{R}^3$
in which all our fields and functions are defined (and smooth).
Note that in this notation

$\operatorname{grad} F = \dfrac{\partial F}{\partial x}\mathbf{i} + \dfrac{\partial F}{\partial y}\mathbf{j} + \dfrac{\partial F}{\partial z}\mathbf{k}.$


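Definition 5. The rotation rot u of a vector field

$\mathbf{u} = P\mathbf{i} + Q\mathbf{j} + R\mathbf{k}$

is the vector field

$\operatorname{rot} \mathbf{u} = \left( \dfrac{\partial R}{\partial y} - \dfrac{\partial Q}{\partial z} \right)\mathbf{i} + \left( \dfrac{\partial P}{\partial z} - \dfrac{\partial R}{\partial x} \right)\mathbf{j} + \left( \dfrac{\partial Q}{\partial x} - \dfrac{\partial P}{\partial y} \right)\mathbf{k}.$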

The symbol curl u was formerly used instead of rot u, 
but now it has gone out of use*. 

It is clear that the mapping

$\operatorname{rot}\colon \mathcal{X}(U) \to \mathcal{X}(U)$

is a homomorphism (a linear operator). Its kernel consists
exactly of irrotational fields, and the statement that any
potential field is irrotational implies that

(11) $\operatorname{rot} \operatorname{grad} F = 0$

for any function $F \in \mathcal{F}(U)$.

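Example 1. A field of the form

$\mathbf{u} = f(r)\,\mathbf{r},$

where $r = |\mathbf{r}|$ and f is an arbitrary function smooth for
r > 0, is called a central field. It is defined and
smooth everywhere, except for the point (0, 0, 0).

For that field

$P = f(r)x, \quad Q = f(r)y, \quad R = f(r)z.$

On the other hand, differentiating the formula
$r = \sqrt{x^2 + y^2 + z^2}$ we immediately have

$\dfrac{\partial r}{\partial x} = \dfrac{x}{r}, \quad \dfrac{\partial r}{\partial y} = \dfrac{y}{r}, \quad \dfrac{\partial r}{\partial z} = \dfrac{z}{r}.$

Hence

(12)
$\dfrac{\partial P}{\partial x} = f'(r)\dfrac{x^2}{r} + f(r), \quad \dfrac{\partial P}{\partial y} = f'(r)\dfrac{xy}{r}, \quad \dfrac{\partial P}{\partial z} = f'(r)\dfrac{xz}{r},$
$\dfrac{\partial Q}{\partial x} = f'(r)\dfrac{xy}{r}, \quad \dfrac{\partial Q}{\partial y} = f'(r)\dfrac{y^2}{r} + f(r), \quad \dfrac{\partial Q}{\partial z} = f'(r)\dfrac{yz}{r},$
$\dfrac{\partial R}{\partial x} = f'(r)\dfrac{xz}{r}, \quad \dfrac{\partial R}{\partial y} = f'(r)\dfrac{yz}{r}, \quad \dfrac{\partial R}{\partial z} = f'(r)\dfrac{z^2}{r} + f(r).$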


* The usage in the USSR is meant by the author. (Tr.)






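Therefore, in particular,

$\dfrac{\partial R}{\partial y} = \dfrac{\partial Q}{\partial z}, \quad \dfrac{\partial P}{\partial z} = \dfrac{\partial R}{\partial x}, \quad \dfrac{\partial Q}{\partial x} = \dfrac{\partial P}{\partial y},$

i.e. rot u = 0. Thus every central field is an irrotational
field. Moreover, it turns out that every central field is potential.
Indeed, by setting

$F(r) = \int_1^r r f(r)\,dr$

we immediately have u = grad F, since
$\operatorname{grad} F = F'(r)\,\dfrac{\mathbf{r}}{r} = f(r)\,\mathbf{r}$. □

If in particular $f(r) = 1/r^3$ and hence $|\mathbf{u}| = 1/r^2$ (the
gravitational field of a material point), then (up to a constant)
$F(r) = -1/r$ (the Newtonian potential).

[Figure: The velocity field of a plane rotation]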

Example 2. Let

$P = -y, \quad Q = x, \quad R = 0$

(the velocity field of a plane rotation). Then

$\operatorname{rot} \mathbf{u} = 2\mathbf{k}.$
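The two examples can be checked numerically. The sketch below is only an illustration under stated assumptions: the helper names `partial` and `rot`, the sample point, and the use of central differences are choices made here, and the rotation is computed from the usual coordinate formula for rot u.

```python
import math

def partial(g, p, i, h=1e-5):
    # Central-difference partial derivative of the scalar function g at p in coordinate i
    pp = list(p); pp[i] += h
    pm = list(p); pm[i] -= h
    return (g(pp) - g(pm)) / (2 * h)

def rot(u, p):
    # u maps a point (x, y, z) to the components (P, Q, R) of the field
    P = lambda q: u(q)[0]
    Q = lambda q: u(q)[1]
    R = lambda q: u(q)[2]
    return (partial(R, p, 1) - partial(Q, p, 2),
            partial(P, p, 2) - partial(R, p, 0),
            partial(Q, p, 0) - partial(P, p, 1))

def rotation_field(p):
    # Example 2: P = -y, Q = x, R = 0
    x, y, z = p
    return (-y, x, 0.0)

def central_field(p):
    # Example 1 with f(r) = 1 / r^3 (the gravitational field of a material point)
    x, y, z = p
    r = math.sqrt(x * x + y * y + z * z)
    return (x / r ** 3, y / r ** 3, z / r ** 3)

point = [0.7, -1.2, 0.4]
print(rot(rotation_field, point))  # close to (0, 0, 2), i.e. 2k
print(rot(central_field, point))   # close to (0, 0, 0)
```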



Under multiplication by functions we have for the operator rot

(13) $\operatorname{rot}(F\mathbf{u}) = F \operatorname{rot} \mathbf{u} + \operatorname{grad} F \times \mathbf{u},$

which can be checked by direct computation.

Here by a vector product of two fields we naturally mean
the field resulting when we perform vector multiplication
of those fields at every point.

A field which is a rotation, i.e. has the form rot u, is
called solenoidal (derived from the Greek word solen, tube).
If v = rot u, then the field u is called the vector potential
of the field v. It is determined uniquely up to a summand which
is an irrotational field, i.e. which, in the simplest domains U,
has the form grad F.

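Definition 6. The divergence div u of a vector field

$\mathbf{u} = P\mathbf{i} + Q\mathbf{j} + R\mathbf{k}$

is the function

(14) $\operatorname{div} \mathbf{u} = \dfrac{\partial P}{\partial x} + \dfrac{\partial Q}{\partial y} + \dfrac{\partial R}{\partial z}.$

The field u is said to be a field without sources if the function
div u is identically zero.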

Example 3. For a central field $\mathbf{u} = f(r)\,\mathbf{r}$ we have (see
formulas (12))

(15) $\operatorname{div} \mathbf{u} = 3f(r) + r f'(r).$

When $f(r) = 1/r^3$ this expression is equal to zero. Thus
the force field of a Newtonian potential has no sources.

An automatic computation shows that

$\operatorname{div} \operatorname{rot} \mathbf{u} = 0$

for any field $\mathbf{u} \in \mathcal{X}(U)$. Thus every solenoidal field is a
field without sources. □
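Formula (15) can also be checked numerically. This is a minimal sketch under stated assumptions: the helpers `div` and `central`, the sample point `p`, and the two sample choices of f (namely r^2 and 1/r^3) are introduced here for illustration, and derivatives are approximated by central differences.

```python
import math

def div(u, p, h=1e-5):
    # Sum of the central-difference approximations of dP/dx, dQ/dy, dR/dz at p
    total = 0.0
    for i in range(3):
        pp = list(p); pp[i] += h
        pm = list(p); pm[i] -= h
        total += (u(pp)[i] - u(pm)[i]) / (2 * h)
    return total

def central(f):
    # Builds the central field u = f(r) r from a function f of the radius
    def u(p):
        radius = math.sqrt(sum(c * c for c in p))
        return tuple(f(radius) * c for c in p)
    return u

p = [0.7, -1.2, 0.4]
r = math.sqrt(sum(c * c for c in p))

# f(r) = r^2: formula (15) gives 3 r^2 + r * 2r = 5 r^2
print(div(central(lambda s: s ** 2), p), 5 * r ** 2)

# f(r) = 1 / r^3: formula (15) gives zero (a field without sources)
print(div(central(lambda s: s ** -3), p))
```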

Again the converse is true only for fairly “simple” domains,
and again the dimension of the factor space

(vector space of fields without sources)/(vector space of solenoidal fields)

can serve as a measure of the complexity of a domain U.


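The mapping

$\operatorname{div}\colon \mathcal{X}(U) \to \mathcal{F}(U)$

is obviously linear and, as is shown by a direct check,

(16) $\operatorname{div}(F\mathbf{u}) = F \operatorname{div} \mathbf{u} + \mathbf{u} \operatorname{grad} F$

for any function $F \in \mathcal{F}(U)$ and any field $\mathbf{u} \in \mathcal{X}(U)$.

Thus we have defined three linear mappings:

$\mathcal{F}(U) \xrightarrow{\ \operatorname{grad}\ } \mathcal{X}(U) \xrightarrow{\ \operatorname{rot}\ } \mathcal{X}(U) \xrightarrow{\ \operatorname{div}\ } \mathcal{F}(U)$

possessing properties (7), (13) and (16) and such that the
compositions $\operatorname{rot} \circ \operatorname{grad}$ and $\operatorname{div} \circ \operatorname{rot}$ are zero.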


The theory of these linear mappings is known as vector 
analysis. It plays an especially important role in the theory 
of electromagnetism in physics. 

Every electromagnetic field (for example, light) is given
at each point of a medium by two vectors, an electric vector
E and a magnetic vector H. These vectors depend not only
on the point, but also on time t and are completely defined
if we know the electric charge density $\rho$ and the vector field j
of current density. Equations relating E and H to $\rho$ and j
have (in the corresponding system of units) the form

$\operatorname{div} \mathbf{E} = 4\pi\rho, \quad \operatorname{rot} \mathbf{E} = -\dfrac{1}{c}\dfrac{\partial \mathbf{H}}{\partial t},$

$\operatorname{div} \mathbf{H} = 0, \quad \operatorname{rot} \mathbf{H} = \dfrac{1}{c}\dfrac{\partial \mathbf{E}}{\partial t} + \dfrac{4\pi}{c}\mathbf{j},$

where c is the velocity of light. These equations, called the
Maxwell equations, underlie the entire theory of electromagnetism
and, in particular, that of optics and radio
engineering. As a matter of fact, vector analysis was first
developed as a tool for investigating these equations. However,
it has been successfully used, say, in continuum mechanics
as well, and is of course of no small purely mathematical
importance.

The most important chapters of vector analysis are connected
with the so-called integral formulas which we shall
discuss in the third semester’s lectures. For the time being,
however, we shall consider only the simplest formulas of
vector analysis that use no integrals.


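To derive these formulas, it is appropriate to introduce
what is called Hamilton's symbolic vector field

$\nabla = \dfrac{\partial}{\partial x}\mathbf{i} + \dfrac{\partial}{\partial y}\mathbf{j} + \dfrac{\partial}{\partial z}\mathbf{k}.$

Assuming that a product of, say, $\dfrac{\partial}{\partial x}$ by a function P is the
partial derivative $\dfrac{\partial P}{\partial x}$, we may consider the right-hand side
of formula (14) defining the function div u as a scalar product
of the field $\nabla$ by the field u. Thus

$\operatorname{div} \mathbf{u} = \nabla \mathbf{u}.$

Similarly the field rot u may be represented as a vector product

$\operatorname{rot} \mathbf{u} = \nabla \times \mathbf{u},$

which, incidentally, allows us to write for rot u a beautiful
determinantal expression:

$\operatorname{rot} \mathbf{u} = \begin{vmatrix} \mathbf{i} & \mathbf{j} & \mathbf{k} \\ \dfrac{\partial}{\partial x} & \dfrac{\partial}{\partial y} & \dfrac{\partial}{\partial z} \\ P & Q & R \end{vmatrix}.$

Finally, by allowing the numerical factor to be written
at the right of a vector, the field grad F can also be represented
as a product of $\nabla$ by F:

$\operatorname{grad} F = \nabla F.$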


Now let a and b be either functions or vector fields.
Then they can be multiplied together in many different
ways (for example, if a and b are vector fields, by performing
scalar or vector multiplication). Let $*_1$ and $*_2$ be two
multiplications such that the expression $\nabla *_1 (a *_2 b)$ is
well-defined.

The familiar rule of differentiating a product may be
formulated as follows: we differentiate the product twice,
differentiating only one factor at a time, and then add
both results together. It is fairly clear that the same rule
also applies to an action by the operator $\nabla$. Therefore

(17) $\nabla *_1 (a *_2 b) = \nabla *_1 (\overset{\downarrow}{a} *_2 b) + \nabla *_1 (a *_2 \overset{\downarrow}{b}),$

where the vertical arrow marks the factor acted upon by
the operator $\nabla$.

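Let, for example, a and b be functions F and G (and
hence let $*_2$ be the multiplication of functions and $*_1$ the
multiplication of a vector field by a function). Then

$\nabla(FG) = \nabla(\overset{\downarrow}{F}G) + \nabla(F\overset{\downarrow}{G}).$

But it is clear that $\nabla(F\overset{\downarrow}{G}) = F(\nabla G)$ and similarly
$\nabla(\overset{\downarrow}{F}G) = G(\nabla F)$. Hence

$\nabla(FG) = F(\nabla G) + G(\nabla F).$

It is the familiar formula (7).

If a is a function F and b is a field u, then formula (17)
yields

$\nabla(F\mathbf{u}) = \nabla(\overset{\downarrow}{F}\mathbf{u}) + \nabla(F\overset{\downarrow}{\mathbf{u}})$

and

$\nabla \times (F\mathbf{u}) = \nabla \times (\overset{\downarrow}{F}\mathbf{u}) + \nabla \times (F\overset{\downarrow}{\mathbf{u}}).$

In the first formula $\nabla(F\overset{\downarrow}{\mathbf{u}}) = F(\nabla\mathbf{u})$ and
$\nabla(\overset{\downarrow}{F}\mathbf{u}) = (\nabla F)\mathbf{u}$, so that

$\nabla(F\mathbf{u}) = F(\nabla\mathbf{u}) + (\nabla F)\mathbf{u}.$

It is formula (16).

Similarly, in the second formula $\nabla \times (F\overset{\downarrow}{\mathbf{u}}) = F(\nabla \times \mathbf{u})$
and $\nabla \times (\overset{\downarrow}{F}\mathbf{u}) = (\nabla F) \times \mathbf{u}$ and hence

$\nabla \times (F\mathbf{u}) = F(\nabla \times \mathbf{u}) + \nabla F \times \mathbf{u}.$

It is formula (13).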






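Finally, if a and b are fields u and v, then three new
formulas result:

(17a) $\nabla(\mathbf{u}\mathbf{v}) = \nabla(\overset{\downarrow}{\mathbf{u}}\mathbf{v}) + \nabla(\mathbf{u}\overset{\downarrow}{\mathbf{v}}),$

(17b) $\nabla(\mathbf{u} \times \mathbf{v}) = \nabla(\overset{\downarrow}{\mathbf{u}} \times \mathbf{v}) + \nabla(\mathbf{u} \times \overset{\downarrow}{\mathbf{v}}),$

(17c) $\nabla \times (\mathbf{u} \times \mathbf{v}) = \nabla \times (\overset{\downarrow}{\mathbf{u}} \times \mathbf{v}) + \nabla \times (\mathbf{u} \times \overset{\downarrow}{\mathbf{v}}).$

Formula (17b) is the easiest to decipher. Indeed, using
the properties of a triple product we immediately get

$\nabla(\mathbf{u} \times \overset{\downarrow}{\mathbf{v}}) = \nabla\mathbf{u}\overset{\downarrow}{\mathbf{v}} = -\mathbf{u}\nabla\overset{\downarrow}{\mathbf{v}} = -\mathbf{u}(\nabla \times \mathbf{v})$

and

$\nabla(\overset{\downarrow}{\mathbf{u}} \times \mathbf{v}) = \nabla\overset{\downarrow}{\mathbf{u}}\mathbf{v} = \mathbf{v}\nabla\overset{\downarrow}{\mathbf{u}} = \mathbf{v}(\nabla \times \mathbf{u}).$

Hence formula (17b) is equivalent to the formula

(18) $\operatorname{div}(\mathbf{u} \times \mathbf{v}) = (\operatorname{rot} \mathbf{u})\mathbf{v} - \mathbf{u} \operatorname{rot} \mathbf{v}.$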


Of course that “hence” is highly arbitrary, since we have
in no way substantiated the validity of applying the properties
of a triple product to products containing the symbolic
field $\nabla$. Such a substantiation would lead us too far afield,
and would besides have to be supplemented with a more detailed
justification of the original formula (17), which strictly speaking
was assumed above virtually without proof. We are
therefore justified in regarding all the foregoing as nothing
but mnemonic or at best heuristic considerations combining
in a single formula, (17), the hitherto entirely unrelated
formulas (7), (16), (13) and (18). As to the formal proof of
these last formulas (and, in particular, of the new formula
(18)), nothing remains but to check each of them independently
by direct calculation.
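Such a direct check of formula (18) can at least be imitated numerically. The following is only a sketch under stated assumptions: the sample polynomial fields `u` and `v`, the point `p`, and the helper names are hypothetical, and all derivatives are approximated by central differences.

```python
def partial(g, p, i, h=1e-5):
    # Central-difference partial derivative of the scalar function g at p in coordinate i
    pp = list(p); pp[i] += h
    pm = list(p); pm[i] -= h
    return (g(pp) - g(pm)) / (2 * h)

def comp(u, i):
    # The i-th component of the field u, as a scalar function of the point
    return lambda q: u(q)[i]

def rot(u, p):
    return (partial(comp(u, 2), p, 1) - partial(comp(u, 1), p, 2),
            partial(comp(u, 0), p, 2) - partial(comp(u, 2), p, 0),
            partial(comp(u, 1), p, 0) - partial(comp(u, 0), p, 1))

def div(u, p):
    return sum(partial(comp(u, i), p, i) for i in range(3))

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

# Hypothetical sample fields u = (yz, x^2, z) and v = (x, xy, y^2)
u = lambda p: (p[1] * p[2], p[0] ** 2, p[2])
v = lambda p: (p[0], p[0] * p[1], p[1] ** 2)
w = lambda p: cross(u(p), v(p))

p = [0.5, -0.3, 1.1]
lhs = div(w, p)                                          # div (u x v)
rhs = dot(rot(u, p), v(p)) - dot(u(p), rot(v, p))        # (rot u) v - u rot v
print(lhs, rhs)
```

Agreement at one point for one pair of fields is of course not a proof; it only guards against sign errors in the mnemonic derivation.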

The possibilities of formula (17) are not restricted to 
the four formulas listed: we have not yet deciphered the 
two symbolic formulas (17a) and (17c). We shall need the 
following lemma to transform them. 

Lemma. For any three vectors a, b, c the formula

(19) $\mathbf{c} \times (\mathbf{a} \times \mathbf{b}) = (\mathbf{c}\mathbf{b})\,\mathbf{a} - (\mathbf{a}\mathbf{c})\,\mathbf{b}$

is valid.






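Proof. Choose an orthonormal basis i, j, k such that the
vector a is collinear with the vector i and the vector b is
coplanar with the vectors i and j. Then

$\mathbf{a} = a_1\mathbf{i},$

$\mathbf{b} = b_1\mathbf{i} + b_2\mathbf{j},$

$\mathbf{c} = c_1\mathbf{i} + c_2\mathbf{j} + c_3\mathbf{k},$

and therefore

$\mathbf{a} \times \mathbf{b} = (a_1 b_2)\,\mathbf{k},$

$\mathbf{c} \times (\mathbf{a} \times \mathbf{b}) = (a_1 b_2 c_2)\,\mathbf{i} - (a_1 b_2 c_1)\,\mathbf{j}.$

On the other hand,

$\mathbf{c}\mathbf{b} = c_1 b_1 + c_2 b_2, \quad \mathbf{a}\mathbf{c} = a_1 c_1,$

and therefore

$(\mathbf{c}\mathbf{b})\,\mathbf{a} - (\mathbf{a}\mathbf{c})\,\mathbf{b} = (c_1 b_1 + c_2 b_2)\,a_1\mathbf{i} - a_1 c_1 (b_1\mathbf{i} + b_2\mathbf{j}) = (a_1 b_2 c_2)\,\mathbf{i} - (a_1 b_2 c_1)\,\mathbf{j}.$

Hence $\mathbf{c} \times (\mathbf{a} \times \mathbf{b}) = (\mathbf{c}\mathbf{b})\,\mathbf{a} - (\mathbf{a}\mathbf{c})\,\mathbf{b}$. □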
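Formula (19) can be confirmed on concrete vectors. The sketch below uses three arbitrarily chosen integer vectors (an assumption made here for illustration), so that the arithmetic is exact; the helper names `cross` and `dot` are likewise introduced only for this example.

```python
def cross(a, b):
    # Vector product in an orthonormal basis
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def dot(a, b):
    # Scalar product in an orthonormal basis
    return sum(x * y for x, y in zip(a, b))

a = (1, 2, 3)
b = (-4, 0, 5)
c = (2, -1, 7)

lhs = cross(c, cross(a, b))                                          # c x (a x b)
rhs = tuple(dot(c, b) * ai - dot(a, c) * bi for ai, bi in zip(a, b))  # (cb) a - (ac) b
print(lhs, rhs)  # both are (111, 54, -24)
```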

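We shall apply the lemma to the case where one of the
factors is the vector field $\nabla$, i.e. again merely for purely
heuristic-mnemonic purposes. Besides, to obtain the right
formulas we have to give the expression $\mathbf{a}\nabla$,
where $\mathbf{a} = A\mathbf{i} + B\mathbf{j} + C\mathbf{k}$ is some vector field, a meaning different
from the one, $\nabla\mathbf{a} = \operatorname{div} \mathbf{a}$, that suggests itself, i.e. to give
up the commutativity of scalar multiplication in the case
of the symbolic vector field $\nabla$.

Namely, we shall consider the expression $\mathbf{a}\nabla$ to be an
operator acting on a vector field $\mathbf{u} = P\mathbf{i} + Q\mathbf{j} + R\mathbf{k}$ by the
formula

$(\mathbf{a}\nabla)\mathbf{u} = \left( A\dfrac{\partial P}{\partial x} + B\dfrac{\partial P}{\partial y} + C\dfrac{\partial P}{\partial z} \right)\mathbf{i} + \left( A\dfrac{\partial Q}{\partial x} + B\dfrac{\partial Q}{\partial y} + C\dfrac{\partial Q}{\partial z} \right)\mathbf{j} + \left( A\dfrac{\partial R}{\partial x} + B\dfrac{\partial R}{\partial y} + C\dfrac{\partial R}{\partial z} \right)\mathbf{k}.$

Adopting this convention we get, in view of (19),

$\nabla \times (\mathbf{u} \times \overset{\downarrow}{\mathbf{v}}) = (\nabla\overset{\downarrow}{\mathbf{v}})\mathbf{u} - (\mathbf{u}\nabla)\overset{\downarrow}{\mathbf{v}} = (\operatorname{div} \mathbf{v})\mathbf{u} - (\mathbf{u}\nabla)\mathbf{v},$

$\nabla \times (\overset{\downarrow}{\mathbf{u}} \times \mathbf{v}) = (\mathbf{v}\nabla)\overset{\downarrow}{\mathbf{u}} - (\nabla\overset{\downarrow}{\mathbf{u}})\mathbf{v} = (\mathbf{v}\nabla)\mathbf{u} - (\operatorname{div} \mathbf{u})\mathbf{v},$

and thus formula (17c) yields

(20) $\operatorname{rot}(\mathbf{u} \times \mathbf{v}) = (\mathbf{v}\nabla)\mathbf{u} - (\mathbf{u}\nabla)\mathbf{v} + (\operatorname{div} \mathbf{v})\mathbf{u} - (\operatorname{div} \mathbf{u})\mathbf{v}.$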

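To transform formula (17a) in a similar manner we apply
formula (19) after rewriting it in the following form:

$\mathbf{c} \times (\mathbf{a} \times \mathbf{b}) = \mathbf{a}(\mathbf{c}\mathbf{b}) - (\mathbf{c}\mathbf{a})\mathbf{b}.$

Then we get

$\mathbf{u} \times (\nabla \times \mathbf{v}) = \nabla(\mathbf{u}\overset{\downarrow}{\mathbf{v}}) - (\mathbf{u}\nabla)\mathbf{v}$

and

$\mathbf{v} \times (\nabla \times \mathbf{u}) = \nabla(\overset{\downarrow}{\mathbf{u}}\mathbf{v}) - (\mathbf{v}\nabla)\mathbf{u}$

and therefore

$\nabla(\overset{\downarrow}{\mathbf{u}}\mathbf{v}) + \nabla(\mathbf{u}\overset{\downarrow}{\mathbf{v}}) = \mathbf{u} \times (\nabla \times \mathbf{v}) + \mathbf{v} \times (\nabla \times \mathbf{u}) + (\mathbf{v}\nabla)\mathbf{u} + (\mathbf{u}\nabla)\mathbf{v}.$

Thus formula (17a) yields the formula

(21) $\operatorname{grad}(\mathbf{u}\mathbf{v}) = \mathbf{u} \times \operatorname{rot} \mathbf{v} + \mathbf{v} \times \operatorname{rot} \mathbf{u} + (\mathbf{v}\nabla)\mathbf{u} + (\mathbf{u}\nabla)\mathbf{v}.$

Of course, formal proofs of formulas (20) and (21) must
as before consist in direct calculations.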


Interesting relations hold for compositions of operators 
grad, rot and div as well. 

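As we already know

$\operatorname{rot} \circ \operatorname{grad} = 0,$

$\operatorname{div} \circ \operatorname{rot} = 0.$

Note that the operator $\nabla$ reduces these formulas to the assertion
that a vector product of two equal vectors is zero:

$\operatorname{rot}(\operatorname{grad} F) = \nabla \times (\nabla F) = (\nabla \times \nabla) F = 0$

and

$\operatorname{div}(\operatorname{rot} \mathbf{u}) = \nabla(\nabla \times \mathbf{u}) = (\nabla \times \nabla)\mathbf{u} = 0.$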







Of special interest is the operator

$\Delta = \operatorname{div} \circ \operatorname{grad}$

which may be regarded as the scalar square of the Hamiltonian
operator $\nabla$. This operator is called the Laplacian
operator. It is an operator from $\mathcal{F}(U)$ into $\mathcal{F}(U)$ and acts by

$\Delta F = \dfrac{\partial^2 F}{\partial x^2} + \dfrac{\partial^2 F}{\partial y^2} + \dfrac{\partial^2 F}{\partial z^2}.$

It is used to write the most important equations of mathematical
physics, to which a separate course is devoted in
the curricula of universities.

A function F is said to be harmonic if $\Delta F = 0$. An example
of a harmonic function is the Newtonian potential
$F = -1/r$ (see above). As will be shown in the course in
the equations of mathematical physics, any harmonic function is
the potential of the gravitational field of some mass. This alone
shows the important role played by harmonic functions
in physics (and hence also in mathematics).
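That the Newtonian potential is harmonic away from the origin can be checked numerically. This is a minimal sketch under stated assumptions: the helper `laplacian`, the sample point `p` and the comparison function `quadratic` are introduced here for illustration, and second derivatives are approximated by second central differences.

```python
import math

def laplacian(F, p, h=1e-3):
    # Sum of second central differences along the three coordinate axes
    total = 0.0
    for i in range(3):
        pp = list(p); pp[i] += h
        pm = list(p); pm[i] -= h
        total += (F(pp) - 2 * F(p) + F(pm)) / (h * h)
    return total

def newtonian(p):
    # The Newtonian potential F = -1/r
    return -1.0 / math.sqrt(sum(c * c for c in p))

def quadratic(p):
    # Not harmonic: the Laplacian of x^2 + y^2 + z^2 is 2 + 2 + 2 = 6
    return p[0] ** 2 + p[1] ** 2 + p[2] ** 2

p = [0.7, -1.2, 0.4]
print(laplacian(newtonian, p))  # close to 0
print(laplacian(quadratic, p))  # close to 6
```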

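The operator $\Delta$ can be applied to vector fields as well
by acting with it on every individual component: if
$\mathbf{u} = P\mathbf{i} + Q\mathbf{j} + R\mathbf{k}$, then

$\Delta\mathbf{u} = (\Delta P)\,\mathbf{i} + (\Delta Q)\,\mathbf{j} + (\Delta R)\,\mathbf{k}.$

Then the following formula holds

$\operatorname{rot} \circ \operatorname{rot} = \operatorname{grad} \circ \operatorname{div} - \Delta.$

Indeed, according to the lemma

$\operatorname{rot}(\operatorname{rot} \mathbf{u}) = \nabla \times (\nabla \times \mathbf{u}) = \nabla(\nabla\mathbf{u}) - (\nabla\nabla)\mathbf{u}.$ □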


It is possible to set up other differential expressions as
well. For example, for any two functions F and G the scalar
product of their gradients is defined:

$\Delta(F, G) = \operatorname{grad} F \operatorname{grad} G = \dfrac{\partial F}{\partial x}\dfrac{\partial G}{\partial x} + \dfrac{\partial F}{\partial y}\dfrac{\partial G}{\partial y} + \dfrac{\partial F}{\partial z}\dfrac{\partial G}{\partial z}.$

It is called Beltrami's mixed differential parameter of the
functions F and G. In particular, when F = G we obtain the
scalar square of a gradient:

$(\operatorname{grad} F)^2 = \left( \dfrac{\partial F}{\partial x} \right)^2 + \left( \dfrac{\partial F}{\partial y} \right)^2 + \left( \dfrac{\partial F}{\partial z} \right)^2.$

It is called Beltrami's first differential parameter of the
function F.

The triple product of the gradients of three functions is
called Darboux's differential parameter. This term, however,
is almost completely out of use, since the triple product is
nothing but the Jacobian of the transformation defined by
the three given functions.




Lecture 23 


Continuous, smooth, and regular curves • Equivalent curves
• Regular curves in the plane and graphs of functions • The
tangential hyperplane of a hypersurface • The length of
a curve • Curves in the plane • Curves in three-dimensional
space


Explicating the notion of a curve as the trajectory of a 
point we obtain the following definition. 

Definition 1. A continuous curve in an n-dimensional
Euclidean (or affine) space $\mathcal{E}$ is a continuous mapping

(1) $\mathbf{x}\colon t \mapsto \mathbf{x}(t)$

of some closed interval [a, b], a < b, of the axis t into the
space $\mathcal{E}$ (meaning that points of the space $\mathcal{E}$ are characterized
by their radius vectors with respect to a fixed point O).

It makes sense to speak of the continuity of mappings of
the form (1), since the Euclidean space $\mathcal{E}$ is a metric space.
It can easily be shown (do it!) that the continuity of mapping (1)
is equivalent to the continuity of the n numerical
functions

(2) $x_i\colon t \mapsto x_i(t), \quad i = 1, \ldots, n,$

where $x_1(t), \ldots, x_n(t)$ are the coordinates of the vector
$\mathbf{x}(t)$ in an arbitrary basis. Since the basis is a priori in no
way connected with any metric (is not orthonormal), we see
that a mapping (1) continuous in one metric is so in any other.
This means that the continuity property of mapping (1)
does not depend on any metric and is therefore an affine
property. In other words, it makes sense to speak of continuous
mappings of the form (1) also when $\mathcal{E}$ is an affine
space (and hence is not a metric space). Cf. Definition 1
of Lecture 12 in [1].

The reader must already know Definition 1 (as applied
to the space $\mathbb{R}^n$) from the course in analysis.

We stress that according to this definition a continuous
curve is a mapping, not a set of points. Nevertheless, one
uses terminology referring to curves as if they were sets.
Thus curve (1) is said to pass through a point $\mathbf{x}_0$ if there exists
a value $t_0$ (generally speaking, more than one) of the parameter t
such that $\mathbf{x}(t_0) = \mathbf{x}_0$. The point $\mathbf{x}(a)$ is called the
initial point of curve (1) and the point $\mathbf{x}(b)$ is its terminal
point. Also curve (1) is said to connect the point $\mathbf{x}(a)$ to the
point $\mathbf{x}(b)$, and so on.

The set of all points of curve (1), i.e. the image of the
interval [a, b] under mapping (1), is sometimes called
the support of curve (1).

Definition 1 was proposed as early as the last century 
by the French mathematician Jordan who was certain (and 
this certainty of his was shared by all mathematicians) 
that it reflects fairly well the intuitive notion of a curve. 
But soon the whole mathematical world was astounded at the news
that the Italian mathematician Peano had constructed
a continuous curve that passes (several times, in fact)
through each (!) point of a square. It became clear that the
continuity condition alone is not enough and that some
other, additional conditions are necessary.

In our previous lecture we introduced the concept of a
function F smooth on some open set $U \subset \mathbb{R}^n$. Now consider
an arbitrary set $C \subset \mathbb{R}^n$ and some function f given on C.
We say that the function f is a function smooth on C if there
exists an open set $U \subset \mathbb{R}^n$ and a function F smooth on U
such that $C \subset U$ and

$f = F|_C.$

In particular, a function $t \mapsto x(t)$ given on the interval
[a, b] will be said to be smooth if on some open interval
containing the closed interval [a, b] there exists a smooth
function coinciding on [a, b] with the function x(t).

Definition 2. Mapping (1) is said to be a smooth curve in $\mathcal{E}$
if the coordinate functions (2) are functions smooth on
[a, b].






It is obvious that this definition is correct (is independent
of the choice of coordinate system).

For any smooth curve (1) and any $t \in [a, b]$, there exists
the limit

(3) $\lim_{\Delta t \to 0} \dfrac{\mathbf{x}(t + \Delta t) - \mathbf{x}(t)}{\Delta t}.$

This limit is called the tangent vector of (or to) curve (1)
at the point t (or at the point $\mathbf{x}(t)$). Its coordinates are
obviously the derivatives

(4) $x_1'(t), \ldots, x_n'(t)$

of coordinates (2) of the vector $\mathbf{x}(t)$. Vector (3) is also
designated by the symbol $\mathbf{x}'(t)$.

This construction may obviously be iterated any number
of times to yield vectors $\mathbf{x}''(t)$, $\mathbf{x}'''(t)$, etc. whose coordinates
are the corresponding derivatives of the coordinate functions (2).

It is easy to see that if for some continuous curve (1)
limit (3) exists, then so do derivatives (4). Thus the smoothness
condition of curve (1) is the existence condition of all the
derivatives $\mathbf{x}'(t), \mathbf{x}''(t), \ldots$ (that we need). This shows once
again that the smoothness condition is independent of the
choice of coordinate system.

On the whole smooth curves (more precisely, their supports)
already correspond to the intuitive idea of a curve.
At any rate a smooth curve, as we shall show in the third
semester’s lectures, cannot pass through all the points of
a square (and what is more, the set of all of its points, i.e.
its support, is what is called a “set of measure zero”). It
may, however, fail to have a “smooth” character at some points
and possess “cusps”, just as, say, the curve $x = t^2$, $y = t^3$
in the plane does. To avoid such pathologies we introduce
the following definition:

Definition 3. A smooth curve (1) is said to be regular
if $\mathbf{x}'(t) \neq 0$ for all $t \in [a, b]$.
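The regularity condition can be illustrated numerically. The sketch below is only an illustration under stated assumptions: the semicubical parabola x = t^2, y = t^3 is taken as the cusp example, the unit circle as a regular comparison curve, and the helper names `tangent` and `speed` are hypothetical; the tangent vector is approximated by a central difference.

```python
import math

def tangent(curve, t, h=1e-6):
    # Central-difference approximation of the tangent vector x'(t)
    a = curve(t + h)
    b = curve(t - h)
    return tuple((ai - bi) / (2 * h) for ai, bi in zip(a, b))

def speed(curve, t):
    # Length of the (approximate) tangent vector at t
    return math.hypot(*tangent(curve, t))

def cusp(t):
    # Smooth but not regular at t = 0: both x'(0) and y'(0) vanish
    return (t ** 2, t ** 3)

def circle(t):
    # Regular everywhere: the tangent vector has length 1
    return (math.cos(t), math.sin(t))

print(speed(cusp, 0.0))    # practically zero: regularity fails at the cusp
print(speed(cusp, 0.5))    # nonzero away from t = 0
print(speed(circle, 0.0))  # close to 1
```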

Now a regular curve fully corresponds to the intuitive 
idea of a “smooth” curve. Before discussing this matter, 
however, we must consider yet another important question. 





From an intuitive, geometrically apparent point of
view, the main drawback of Definitions 1 to 3 is that the
“curves” they introduce are not sets. On the other hand,
the definition of a curve as simply the image of the closed
interval [a, b] under a continuous (smooth or regular)
mapping into the space $\mathcal{E}$ turns out, for many reasons, to be
quite unsatisfactory. The following definition is usually
introduced to approach at least partly the intuitive-geometrical
notion of a curve and to obtain at the same time
its efficient explication.

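Definition 4. Two curves

$\mathbf{x}\colon t \mapsto \mathbf{x}(t), \quad \mathbf{x}_1\colon t_1 \mapsto \mathbf{x}_1(t_1),$

where $a \le t \le b$ and $a_1 \le t_1 \le b_1$ respectively, are said
to be equivalent if there exists a function

(5) $\varphi\colon t \mapsto \varphi(t)$

such that $\varphi(a) = a_1$, $\varphi(b) = b_1$ and $\mathbf{x}(t) = \mathbf{x}_1(\varphi(t))$ for
all $t \in [a, b]$. Function (5) is said to effect a change of
parameter t.

It is clear that equivalent curves have the same supports.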

Classes of equivalent curves are called nonparametric
curves. Many authors (mainly of a more traditional slant)
call them simply curves, referring to curves in the sense
of Definitions 1 to 3 as parametric curves or paths. Intuitively,
transition to an equivalent curve means that without changing
the trajectory of a point we change the velocity with
which it moves along the trajectory. It is clear that this
change of velocity cannot be arbitrary. If for example we
are considering continuous curves, in principle it is necessary
to require that the function $\varphi$ should effect a homeomorphic
(one-to-one and bicontinuous) mapping of the interval
[a, b] onto the interval $[a_1, b_1]$, i.e. that it should be a continuous
and strictly monotonic function (then the inverse
function exists and is also continuous). Otherwise the relation
between curves introduced by Definition 4 is not, generally
speaking, an equivalence relation on the set of all
curves and will not therefore allow introduction of classes
of equivalent curves. It is possible, however, to admit
functions (5) that are not strictly monotonic, and hence
discontinuous inverse functions, provided the curve
$t \mapsto \mathbf{x}_1(\varphi(t))$ for the discontinuous function $\varphi$ remains continuous.
This means that the point is allowed to stop for
a time in moving along the trajectory, and conversely if the
point remained fixed, it is allowed to pass the place without
stopping in the equivalent motion. Moreover, it is possible,
by slightly complicating Definition 4, to admit any nonmonotonic
functions (5) too (thus allowing the point to retrace
its trajectory). It is usual to discuss all these questions in
detail in the course in analysis. But we shall restrict ourselves,
in accordance with our general purpose, to regular
changes of parameter, i.e. to such functions (5) that are,
first, smooth and, second, possess the property that

$\varphi'(t) > 0$ for any $t \in [a, b]$.

This will ensure that the regularity properties are preserved
under changes of parameter.
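The reason regularity is preserved is the chain rule: if x(t) = x1(phi(t)), then x'(t) = phi'(t) x1'(phi(t)), which is nonzero whenever x1' is nonzero and phi' > 0. The following sketch checks this relation numerically; the sample curve `x1` (the unit circle), the sample change of parameter `phi` on [0, 1] and the helper `tangent` are assumptions made for illustration.

```python
import math

def x1(s):
    # Hypothetical sample curve: the unit circle
    return (math.cos(s), math.sin(s))

def phi(t):
    # Hypothetical regular change of parameter on [0, 1]: phi'(t) = 1 + t > 0
    return t + 0.5 * t * t

def x(t):
    # The equivalent curve x(t) = x1(phi(t))
    return x1(phi(t))

def tangent(curve, t, h=1e-6):
    # Central-difference approximation of the tangent vector at t
    a, b = curve(t + h), curve(t - h)
    return tuple((ai - bi) / (2 * h) for ai, bi in zip(a, b))

t = 0.4
dphi = 1.0 + t                                       # exact value of phi'(t)
lhs = tangent(x, t)                                  # x'(t)
rhs = tuple(dphi * c for c in tangent(x1, phi(t)))   # phi'(t) * x1'(phi(t))
print(lhs, rhs)
```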

One should not exaggerate the significance of the concept 
of a nonparametric curve, since, first, it is one order (“an 
extra equivalence”) more complex than the concept of 
a parametric curve and, second, even in spite of this it 
does not fully correspond to the intuitive idea of a curve 
as a set of points (curves may have the same support but 
fail to be equivalent). At the beginning of this century, of 
the two concepts of a curve that of a nonparametric curve 
was considered to be the basic one, as supposedly more 
apparent geometrically. In recent years, however, paramet¬ 
ric curves have more and more often come to the fore not 
only because they are simpler conceptually, but chiefly 
because it is these curves that tend to occur in real mathemat¬ 
ical constructions. In particular, this explains why the 
simple word “curves” formerly applied to nonparametric 
curves is now used more and more often to refer to paramet¬ 
ric curves. 

A role of no small importance is played of course also
by the fact that many natural and convenient concepts and
constructions are not preserved under equivalence and cannot
therefore be defined for nonparametric curves. The
situation is such for example with the concept of a tangent
vector, which is multiplied by $\varphi'(t)$ when passing to the
equivalent curve. Therefore even ardent advocates of the
priority of nonparametric curves pass in practice to parametric
curves, adducing the “naturality” (see below) of the
concept they are introducing, to excuse their fall.

For these reasons the main subject of our study will be
parametric curves and we shall pass to equivalent curves
only sporadically and without attaching any significance
to this.

Now we are in a position to discuss the question of the 
extent to which the concept of a regular curve corresponds 
to the intuitive notion of a curve. For simplicity we shall 
restrict ourselves to the case of a plane. As always coordi¬ 
nates in the plane will be denoted by x and y. 

The graph of an arbitrary smooth function y = y(x) is
the support of the regular curve

$x = t, \quad y = y(t)$

which we shall also call, loosely but quite naturally, the
graph of the function y(x).

What curves in the plane satisfy our intuitive idea of 
a “smooth curve”? It appears to be possible to require that 
the following conditions should be fulfilled: 

(a) the graph of any smooth function (with coordinate 
axes arbitrarily arranged) is a “smooth curve”; 

(b) a curve (regularly) equivalent to a “smooth curve” 
is a “smooth curve”; 

(c) a curve is a “smooth curve” if and only if it is a “smooth 
curve” locally, i.e. in the neighbourhood of any of its points. 

The smallest class of curves that satisfies these conditions
consists of curves locally equivalent (i.e. equivalent in the
neighbourhood of every point) to the graphs of smooth
functions (changing from point to point). It is clear that
all such curves are regular. It turns out (and it is just this that
justifies, from the intuitive point of view, singling out the
class of regular curves) that the converse is also true: any
regular curve in the plane is locally equivalent to the graph
of a smooth function.

Indeed, if the curve

(6)    $x = x(t),\quad y = y(t),\quad a \le t \le b,$

is regular, then for any point $t_0 \in [a, b]$ either $x'(t_0) \ne 0$
or $y'(t_0) \ne 0$. Let for definiteness $x'(t_0) \ne 0$. Then by the
implicit function theorem (applied to the function
$F(x, t) = x - x(t)$) the function $t \mapsto x(t)$ is locally invertible,
i.e. there exist a neighbourhood $U_0$ of the point $t_0$ and a
neighbourhood $V_0$ of the point $x_0 = x(t_0)$ such that the function
$t \mapsto x(t)$ gives a bijective mapping $U_0 \to V_0$, the
inverse function $x \mapsto t(x)$ being smooth. Moreover, $t'(x) \ne 0$
for all $x \in V_0$, and hence if $t'(x) > 0$ in $V_0$, then the
function $x \mapsto t(x)$ will effect a regular change of parameter
for curve (6) in the neighbourhood $U_0$. That change converts
curve (6) (in the neighbourhood $U_0$) into an equivalent
curve which is (in $V_0$) the graph of the smooth function
$y = y(t(x))$. But if $t'(x) < 0$ in $V_0$, then it is necessary to
take $-x$ rather than $x$ as the new parameter (i.e. to change
the sense of the abscissa axis). And if $x'(t_0) = 0$, then the new
parameter will be $y$ (or $-y$). □

Note that in this statement “locality” is understood
“relative to a parameter”, i.e. the restriction of the curve
to some neighbourhood of a point $t_0 \in [a, b]$ is considered.
For the neighbourhood of a point $(x(t_0), y(t_0))$ of the plane
a similar statement is even meaningless.

Example. The curve

$x = \dfrac{3t(1-t)^2}{3t^2 - 3t + 1},\quad y = \dfrac{3t^2(1-t)}{3t^2 - 3t + 1},$

called a folium of Descartes, passes through the point (0, 0)
twice, at $t = 0$ and $t = 1$. It is equivalent to the graph
of some function $y = y(x)$ in the neighbourhood of the
point $t = 0$ and to the graph of a function $x = x(y)$ in
the neighbourhood of the point $t = 1$. But in the neighbourhood
of the point (0, 0) in the plane the curve (or more
exactly its support) is a union of these two graphs.
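
The behaviour at the double point can be checked numerically. The sketch below assumes the parametrization $x = 3t(1-t)^2/(3t^2-3t+1)$, $y = 3t^2(1-t)/(3t^2-3t+1)$; the function names and the Cartesian equation $x^3 + y^3 = 3xy$ used as a sanity check are illustrative additions, not part of the lecture. It estimates the tangent vector by central differences: at $t = 0$ the component $x'$ is nonzero (so $x$ can serve as parameter), while at $t = 1$ it is $y'$ that is nonzero.

```python
def folium(t):
    """Point (x, y) of the folium for the parameter value t."""
    d = 3 * t * t - 3 * t + 1  # denominator; it has no real roots
    return 3 * t * (1 - t) ** 2 / d, 3 * t * t * (1 - t) / d


def velocity(t, h=1e-6):
    """Central-difference estimate of the tangent vector (x'(t), y'(t))."""
    x1, y1 = folium(t - h)
    x2, y2 = folium(t + h)
    return (x2 - x1) / (2 * h), (y2 - y1) / (2 * h)


def cartesian_residual(t):
    """Value of x^3 + y^3 - 3xy, which should vanish on the folium."""
    x, y = folium(t)
    return x ** 3 + y ** 3 - 3 * x * y


if __name__ == "__main__":
    for t in (0.0, 1.0):
        print(t, folium(t), velocity(t))
```

Both parameter values give the same point of the plane but different tangent directions, which is exactly the situation of two branches described in the example.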

For a folium of Descartes the point (0, 0) is what is called
a point of self-intersection. The graphs into which the folium
of Descartes breaks in the neighbourhood of the point (0, 0)
are called its branches. We shall not dwell on phenomena of
this kind since in what follows we confine ourselves to
a local study of curves on sufficiently small intervals of the
axis $t$ (i.e., consequently, when they are equivalent to
graphs) and merely remark that it is because of the presence
of self-intersections that regular curves (or more exactly
their supports) will not be regular hypersurfaces in the plane
in the sense of Definition 1 in the preceding lecture. However,
we cannot all the same state as yet, of course (outside the
limits of local consideration), that the support of every curve
in the plane that has no self-intersections is a regular hypersurface
and that, conversely, any regular hypersurface in the
plane is the support of a regular curve (automatically without
a self-intersection). In the third semester’s lectures we
shall investigate such questions in their natural generality
and therefore leave them undiscussed for the time being.

[Figure: A folium of Descartes]

In the spirit of all the other terminology relating to curves
we shall say that curve (1) lies (or is) on the hypersurface

(7)    $F(\mathbf{x}) = 0$

of a space $\mathcal{E}$ if $F(\mathbf{x}(t)) = 0$ for any $t \in [a, b]$, i.e. if
hypersurface (7) contains the support of that curve.

Definition 5. A vector $\mathbf{a}$ is said to be a tangent vector
of (or to) hypersurface (7) at its point $\mathbf{x}_0$ if on the hypersurface
(7) there lies a curve $t \mapsto \mathbf{x}(t)$ passing through the point
$\mathbf{x}_0$ for $t = t_0$ such that $\mathbf{a}$ is the tangent vector of that
curve at the point $t_0$, i.e. if $\mathbf{a} = \mathbf{x}'(t_0)$.

Let $\mathcal{V}$ be the vector space associated with the point space
$\mathcal{E}$ (Euclidean, for definiteness) and consider the set of all
vectors tangent to hypersurface (7) (assumed to be regular)
at its point $\mathbf{x}_0$.

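Proposition 1. The set of all vectors tangent to a regular
hypersurface (7) at its point $\mathbf{x}_0$ is an $(n-1)$-dimensional
subspace of the associated vector space $\mathcal{V}$, consisting of all
the vectors orthogonal to the vector $\operatorname{grad} F(\mathbf{x}_0)$, i.e. it is the set

$\{\mathbf{a} \in \mathcal{V} : \mathbf{a}\,\operatorname{grad} F(\mathbf{x}_0) = 0\}.$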


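Proof. If $\mathbf{a}$ is a tangent vector at the point $\mathbf{x}_0$, then there
exists a curve $t \mapsto \mathbf{x}(t)$, $a \le t \le b$, in $\mathcal{E}$ such that

(8)    $F(\mathbf{x}(t)) = 0$ for all $t \in [a, b]$

and

(9)    $\mathbf{x}_0 = \mathbf{x}(t_0),\quad \mathbf{a} = \mathbf{x}'(t_0).$

But the formula, known from analysis, for the derivative of
the composite function

$F(\mathbf{x}(t)) = F(x_1(t), \ldots, x_n(t))$

may be written in the form

$\dfrac{dF(\mathbf{x}(t))}{dt} = \mathbf{x}'(t)\,\operatorname{grad} F(\mathbf{x}(t))$

(we naturally assume the coordinates $x_1, \ldots, x_n$ rectangular).
Differentiating relation (8) and putting $t = t_0$ we
therefore get, by virtue of (9),

(10)    $\mathbf{a}\,\operatorname{grad} F(\mathbf{x}_0) = 0.$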

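Conversely, let relation (10) hold. Without loss of generality
we may choose a coordinate system so that the vector
$\operatorname{grad} F(\mathbf{x}_0)$ is parallel to the axis $Ox_n$. Then the following
relations will hold:

(11)    $\dfrac{\partial F}{\partial x_1}(\mathbf{x}_0) = 0,\ \ldots,\ \dfrac{\partial F}{\partial x_{n-1}}(\mathbf{x}_0) = 0,\quad \dfrac{\partial F}{\partial x_n}(\mathbf{x}_0) \ne 0,$

[Figure: A tangent vector]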

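and condition (10) will take the form $a_n = 0$. Since
$\dfrac{\partial F}{\partial x_n}(\mathbf{x}_0) \ne 0$, in the neighbourhood of the point $\mathbf{x}_0$
hypersurface (7) is the graph of some smooth function

$x_n = \varphi(\hat{\mathbf{x}}).$

This means that

$x_n^{(0)} = \varphi(x_1^{(0)}, \ldots, x_{n-1}^{(0)}),$

where $x_1^{(0)}, \ldots, x_n^{(0)}$ are the coordinates of the point $\mathbf{x}_0$, and

$F(\hat{\mathbf{x}}, \varphi(\hat{\mathbf{x}})) = 0$

for all the points $\hat{\mathbf{x}} = (x_1, \ldots, x_{n-1}) \in \mathbb{R}^{n-1}$ belonging
to some neighbourhood $\hat{U}_0$ of the point $\hat{\mathbf{x}}_0 \in \mathbb{R}^{n-1}$. Differentiating
the last identity with respect to $x_1, \ldots, x_{n-1}$ and
putting $\hat{\mathbf{x}} = \hat{\mathbf{x}}_0$ we get by virtue of (11)

$\dfrac{\partial \varphi}{\partial x_1}(\hat{\mathbf{x}}_0) = 0,\ \ldots,\ \dfrac{\partial \varphi}{\partial x_{n-1}}(\hat{\mathbf{x}}_0) = 0.$

Now let $\theta > 0$ be a positive number so small that when
$|t| \le \theta$ the point $\hat{\mathbf{x}}_0 + \hat{\mathbf{a}}t$, where as ever $\hat{\mathbf{a}} = (a_1, \ldots, a_{n-1})$,
lies in $\hat{U}_0$. Then the formulas

$\hat{\mathbf{x}}(t) = \hat{\mathbf{x}}_0 + \hat{\mathbf{a}}t,\quad x_n(t) = \varphi(\hat{\mathbf{x}}_0 + \hat{\mathbf{a}}t)$

will define in $\mathcal{E}$ some curve $t \mapsto \mathbf{x}(t)$, $|t| \le \theta$, lying on
hypersurface (7) and passing for $t = 0$ through the point
$\mathbf{x}_0$. In addition

$\hat{\mathbf{x}}'(0) = \hat{\mathbf{a}}$

and

$x_n'(0) = \left.\dfrac{d\varphi(\hat{\mathbf{x}}_0 + \hat{\mathbf{a}}t)}{dt}\right|_{t=0} = \dfrac{\partial \varphi}{\partial x_1}(\hat{\mathbf{x}}_0)\,a_1 + \ldots + \dfrac{\partial \varphi}{\partial x_{n-1}}(\hat{\mathbf{x}}_0)\,a_{n-1} = 0,$

i.e. $x_n'(0) = a_n$. Consequently, $\mathbf{x}'(0) = \mathbf{a}$ and hence $\mathbf{a}$ is
a tangent vector of hypersurface (7) at the point $\mathbf{x}_0$.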







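Definition 6. The hyperplane of a space $\mathcal{E}$ passing through
a point $\mathbf{x}_0$ and parallel to the subspace of tangent vectors
at that point is called the tangential hyperplane of (or to)
hypersurface (7) at the point $\mathbf{x}_0$.

According to Proposition 1 a tangential hyperplane has
the equation

$(\mathbf{x} - \mathbf{x}_0)\,\operatorname{grad} F(\mathbf{x}_0) = 0,$

i.e. the equation

$\left(\dfrac{\partial F}{\partial x_1}\right)_0 (x_1 - x_1^{(0)}) + \ldots + \left(\dfrac{\partial F}{\partial x_n}\right)_0 (x_n - x_n^{(0)}) = 0.$

The vector $\operatorname{grad} F(\mathbf{x}_0)$ is orthogonal to the hyperplane.
For $n = 2$ we obtain the statement known from the course
in analysis that in the plane the tangent to an arbitrary
curve

$F(x, y) = 0$

at its regular point $(x_0, y_0)$ has the equation

$\left(\dfrac{\partial F}{\partial x}\right)_0 (x - x_0) + \left(\dfrac{\partial F}{\partial y}\right)_0 (y - y_0) = 0.$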
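
Proposition 1 is easy to test on a concrete hypersurface. The following sketch takes the unit sphere $F = x^2 + y^2 + z^2 - 1 = 0$ in three-dimensional space together with a curve lying on it; both the sphere and the particular curve are hypothetical examples chosen for illustration, not taken from the lecture. It checks numerically that the tangent vector of the curve is orthogonal to $\operatorname{grad} F$ and that a point displaced along that vector satisfies the equation of the tangential hyperplane.

```python
import math


def F(p):
    """Left-hand side of the sphere equation F(x) = 0."""
    x, y, z = p
    return x * x + y * y + z * z - 1.0


def grad_F(p):
    """Gradient of F in rectangular coordinates."""
    x, y, z = p
    return (2 * x, 2 * y, 2 * z)


def curve(t):
    """A smooth curve lying on the unit sphere (illustrative choice)."""
    return (math.cos(t) * math.cos(2 * t),
            math.sin(t) * math.cos(2 * t),
            math.sin(2 * t))


def derivative(f, t, h=1e-6):
    """Central-difference estimate of the vector f'(t)."""
    p, q = f(t - h), f(t + h)
    return tuple((qi - pi) / (2 * h) for pi, qi in zip(p, q))


def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))


def hyperplane_value(x, x0):
    """Left-hand side (x - x0) grad F(x0) of the tangential hyperplane equation."""
    return dot(tuple(xi - x0i for xi, x0i in zip(x, x0)), grad_F(x0))


if __name__ == "__main__":
    t0 = 0.4
    x0 = curve(t0)
    a = derivative(curve, t0)
    print(F(x0), dot(a, grad_F(x0)))
```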


The length of a continuous curve (1) is known from the
course in analysis to be the limit (if there is one) of the
lengths of broken lines inscribed into that curve (we assume
the space $\mathcal{E}$ to be Euclidean). For a smooth curve (1) this limit
always exists (the curve is said to be rectifiable) and is expressed
by the integral

(12)    $\int_a^b |\mathbf{x}'(t)|\,dt.$

As a matter of fact the definition of length as the limit of the
lengths of inscribed broken lines is never recalled (at least
for smooth curves) and only formula (12) is used. The simplest
thing therefore is to accept integral (12) as the definition
of the length of a smooth curve and to consider the
reasoning involving broken lines as the definition’s heuristic
motivation. This is the way in which we proceed in the third
semester’s lectures in similar but more involved situations
(for example, when defining the area of a surface).
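
The agreement between the two descriptions of length can be seen numerically. The sketch below compares the lengths of inscribed broken lines with a midpoint-rule approximation of integral (12) for one turn of a circular helix $x = a\cos t$, $y = a\sin t$, $z = bt$; the values $a = 2$, $b = 1$ and all function names are assumptions made only for this illustration.

```python
import math


def helix(t, a=2.0, b=1.0):
    """Point of the helix x = a cos t, y = a sin t, z = b t."""
    return (a * math.cos(t), a * math.sin(t), b * t)


def helix_speed(t, a=2.0, b=1.0):
    """|x'(t)| for the helix above (constant, equal to sqrt(a^2 + b^2))."""
    return math.sqrt((a * math.sin(t)) ** 2 + (a * math.cos(t)) ** 2 + b * b)


def distance(p, q):
    return math.sqrt(sum((qi - pi) ** 2 for pi, qi in zip(p, q)))


def polygon_length(curve, t0, t1, n):
    """Length of the broken line inscribed at n + 1 equally spaced parameter values."""
    pts = [curve(t0 + (t1 - t0) * i / n) for i in range(n + 1)]
    return sum(distance(p, q) for p, q in zip(pts, pts[1:]))


def integral_length(speed, t0, t1, n=1000):
    """Composite midpoint rule for the integral of |x'(t)| over [t0, t1]."""
    h = (t1 - t0) / n
    return h * sum(speed(t0 + (i + 0.5) * h) for i in range(n))


if __name__ == "__main__":
    for n in (4, 16, 64, 256):
        print(n, polygon_length(helix, 0.0, 2 * math.pi, n))
    print("integral:", integral_length(helix_speed, 0.0, 2 * math.pi))
```

The polygon lengths increase towards the value of the integral, which for this helix is $2\pi\sqrt{a^2 + b^2}$.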

Let

(13)    $s(t) = \int_a^t |\mathbf{x}'(t)|\,dt$

be the length of a segment of curve (1) from $a$ to $t$. If
curve (1) is regular, then

$s'(t) = |\mathbf{x}'(t)| > 0$

and therefore a change of parameter $t \mapsto s(t)$ is possible.
Thus any regular curve is equivalent to a curve whose parameter
is an arc length. These last curves are usually said to be
referred to the natural parameter $s$.

In what follows we shall always assume as a rule that all
the curves considered are referred to the natural parameter.
This is of no fundamental significance of course, but substantially
simplifies calculations.

Differentiation with respect to $s$ will be marked with a dot:

$\dot{\mathbf{x}}(s) = \dfrac{d\mathbf{x}(s)}{ds},\quad \ddot{\mathbf{x}}(s) = \dfrac{d^2\mathbf{x}(s)}{ds^2}.$

According to formula (13), if $t = s$, then

$\int_a^s |\dot{\mathbf{x}}(s)|\,ds = s$

(and $a = 0$), from which it follows that

$|\dot{\mathbf{x}}(s)| = 1$ for all $s$.

Conversely, if $|\mathbf{x}'(t)| = 1$ and $a = 0$, then $t = s$.

Lemma 1. Let $s \mapsto \mathbf{u}(s)$ be a vector-valued smooth function
such that $|\mathbf{u}(s)| = 1$ for all $s$. Then

(14)    $\mathbf{u}(s)\,\dot{\mathbf{u}}(s) = 0$ for all $s$.

Proof. It suffices to note that for a scalar product (as well
as for a vector one) of vector-valued functions the usual rule
for differentiating a product of functions taking on numerical
values is valid (since the usual proof remains completely
valid for this case too). Differentiating the equation
$\mathbf{u}(s)^2 = 1$ (and cancelling 2) we therefore obtain (14). □

In particular we see that

$\dot{\mathbf{x}}(s)\,\ddot{\mathbf{x}}(s) = 0$ for all $s$.

We shall make repeated use of this important formula.

Let us consider a particular case of curves in the plane.
Rectangular coordinates in the plane will as always be denoted
by $x$ and $y$, and a radius vector with these coordinates
will be designated by the symbol $\mathbf{r}$ (instead of the symbol $\mathbf{x}$
used in the general case). In addition, for any curve $\mathbf{r} = \mathbf{r}(s)$
in the plane (referred to the natural parameter $s$) we shall
designate by the symbol $\mathbf{t}(s)$ the tangent vector of the curve
at a point $\mathbf{r}(s)$:

$\mathbf{t}(s) = \dot{\mathbf{r}}(s).$

According to the foregoing this vector is a unit vector and

$\mathbf{t}(s)\,\dot{\mathbf{t}}(s) = 0$ for any $s$.

Definition 7. The length of the vector $\dot{\mathbf{t}}(s)$ is designated by
the symbol $k(s)$ and called the curvature of a curve $\mathbf{r} = \mathbf{r}(s)$
at a point $s$.

Thus

$k(s) = |\dot{\mathbf{t}}(s)| = \sqrt{\ddot{x}(s)^2 + \ddot{y}(s)^2}.$

The curvature of a curve referred to an arbitrary parameter
$t$ is the curvature of an equivalent curve referred to the
natural parameter. The formula for the curvature (which
can be obtained by simple but rather awkward calculations
using nothing but formulas for differentiation of functions) is
rather involved:

$k = \dfrac{|x''y' - y''x'|}{(x'^2 + y'^2)^{3/2}}.$

The number $k(s)$ may be interpreted as the instantaneous
rotation velocity of the unit vector $\mathbf{t}(s)$. It is clear that
this velocity is the greater the “more curved” the curve is.
Hence the term “curvature”.
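
As a quick check of the curvature of a curve referred to an arbitrary parameter, the sketch below evaluates the standard expression $k = |x''y' - y''x'| / (x'^2 + y'^2)^{3/2}$ for the ellipse $x = a\cos t$, $y = b\sin t$. The ellipse and the function names are illustrative assumptions; for $a = b = R$ the result should reduce to $1/R$.

```python
import math


def curvature(x1, y1, x2, y2):
    """Curvature from first derivatives (x1, y1) and second derivatives (x2, y2)."""
    return abs(x2 * y1 - y2 * x1) / (x1 * x1 + y1 * y1) ** 1.5


def ellipse_curvature(t, a, b):
    """Curvature of the ellipse x = a cos t, y = b sin t at the parameter value t."""
    x1, y1 = -a * math.sin(t), b * math.cos(t)
    x2, y2 = -a * math.cos(t), -b * math.sin(t)
    return curvature(x1, y1, x2, y2)


if __name__ == "__main__":
    print(ellipse_curvature(0.7, 2.0, 2.0))          # circle of radius 2
    print(ellipse_curvature(0.0, 3.0, 2.0))          # vertex on the major axis
    print(ellipse_curvature(math.pi / 2, 3.0, 2.0))  # vertex on the minor axis
```

For the ellipse the curvature is largest at the ends of the major axis ($a/b^2$) and smallest at the ends of the minor axis ($b/a^2$), in line with the interpretation of $k$ as a rotation velocity of the tangent.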





Sometimes the so-called relative curvature $k_{\mathrm{rel}}$ is considered
(in an oriented plane), equal to the curvature $k$ if (with
$k \ne 0$) the vectors $\mathbf{t}$ and $\dot{\mathbf{t}}$ constitute a positively oriented basis
of the plane, and to $-k$ otherwise. We shall need this curvature
in Lecture 25.

Example 1. If

$x(s) = x_0 + sl,\quad y(s) = y_0 + sm,$ where $l^2 + m^2 = 1$,

i.e. if the curve under consideration is a straight line, then
$\ddot{x}(s) = 0$ and $\ddot{y}(s) = 0$. Therefore $k(s) = 0$ for all $s$, i.e.,
as was to be expected, the curvature of a straight line is
identically zero. □

Since linear functions are, as is easily seen, the only functions
whose second derivative is identically zero, the
converse is also true, i.e. a curve whose curvature is identically
zero is a straight line (or its segment). □

The point $\mathbf{r}_0 = \mathbf{r}(s_0)$ of a curve $\mathbf{r} = \mathbf{r}(s)$ is said to be a
point of rectification if $k(s_0) = 0$.

Example 2. The parametric equations of a circle of radius
$R$ in the natural parameter $s$ are obviously of the form

$x = R\cos\dfrac{s}{R},\quad y = R\sin\dfrac{s}{R}.$

Since

$\ddot{x} = -\dfrac{1}{R}\cos\dfrac{s}{R},\quad \ddot{y} = -\dfrac{1}{R}\sin\dfrac{s}{R},$

we have

$k(s) = \dfrac{1}{R}.$

Thus the curvature of a circle is constant and equal to the inverse
of its radius. □

The converse is also true: a curve with constant curvature is 
a circle (or a segment of a circle). □ 

This follows from the general theorem which states that
for any function $k = k(s)$ (defined and smooth on the interval
$|s| \le s_0$) there exists (if the number $s_0$ is sufficiently small) a
curve $\mathbf{r} = \mathbf{r}(s)$, $|s| \le s_0$, whose curvature is equal to $k(s)$,
the curve being unique up to congruence. We shall not prove
the theorem now since in our next lecture we shall establish
its analogue for any $n$.

If $k(s) \ne 0$, then the number $R(s) = \dfrac{1}{k(s)}$ is defined, called
the radius of curvature of a curve at a point $s$.

A curve $\mathbf{r} = \mathbf{r}(s)$ is said to be a curve of the general type
if there are no points of rectification on it, i.e. if $k(s) \ne 0$
for all $s$. At each point of such a curve a unit vector

$\mathbf{n}(s) = \dfrac{\dot{\mathbf{t}}(s)}{k(s)}$

directed along the normal to the curve (i.e. along the straight
line passing through the point of tangency and perpendicular
to the tangent) is defined.

For any $s$ the vectors $\mathbf{t}(s)$ and $\mathbf{n}(s)$ form an orthonormal
basis called the Frenet moving basis of a given curve.

By definition

$\dot{\mathbf{t}}(s) = k(s)\,\mathbf{n}(s).$

We find a similar formula for the vector $\dot{\mathbf{n}}(s)$. Let

$\dot{\mathbf{n}}(s) = \alpha(s)\,\mathbf{t}(s) + \beta(s)\,\mathbf{n}(s)$

be an expansion of the vector $\dot{\mathbf{n}}(s)$ with respect to the vectors of
the basis $\mathbf{t}(s)$, $\mathbf{n}(s)$. Since $\mathbf{t}(s)\,\mathbf{n}(s) = 0$ we have
$\dot{\mathbf{t}}(s)\,\mathbf{n}(s) + \mathbf{t}(s)\,\dot{\mathbf{n}}(s) = 0$ and so
$\alpha(s) = \mathbf{t}(s)\,\dot{\mathbf{n}}(s) = -\dot{\mathbf{t}}(s)\,\mathbf{n}(s) = -k(s)$.
On the other hand, by Lemma 1, $\beta(s) = \dot{\mathbf{n}}(s)\,\mathbf{n}(s) = 0$.
This proves that for any curve of the general
type there are formulas

(15)    $\dot{\mathbf{t}}(s) = k(s)\,\mathbf{n}(s),\quad \dot{\mathbf{n}}(s) = -k(s)\,\mathbf{t}(s)$

describing the instantaneous rotation of the moving basis
under a change of $s$. □

Formulas (15) are called Frenet’s formulas for a plane
curve.

Now let us consider curves in three-dimensional space
(with coordinates $x$, $y$, $z$ and radius vector $\mathbf{r}$ of points). For
any curve $\mathbf{r} = \mathbf{r}(s)$ (referred to the natural parameter) its
tangent vector $\dot{\mathbf{r}}(s)$ will as before be denoted by $\mathbf{t}(s)$. The
magnitude $|\dot{\mathbf{t}}(s)|$ of the vector $\dot{\mathbf{t}}(s)$ for space curves is also
called curvature and designated by the symbol $k(s)$ as before.
Thus

$k(s) = \sqrt{\ddot{x}(s)^2 + \ddot{y}(s)^2 + \ddot{z}(s)^2}.$

A curve $\mathbf{r} = \mathbf{r}(s)$ is said, as in the case $n = 2$, to be a curve
of the general type if $k(s) \ne 0$ for all $s$. For such a curve a
unit vector

$\mathbf{n}(s) = \dfrac{\dot{\mathbf{t}}(s)}{k(s)},$

called a vector of the principal normal to the curve, is defined.

But now (assuming the space to be oriented) we can introduce
into consideration yet another, a third, vector $\mathbf{b}(s)$
constituting together with the vectors $\mathbf{t}(s)$ and $\mathbf{n}(s)$ a positively
oriented orthonormal basis $\mathbf{t}(s)$, $\mathbf{n}(s)$, $\mathbf{b}(s)$. This vector
is called the binormal vector and the basis $\mathbf{t}(s)$, $\mathbf{n}(s)$, $\mathbf{b}(s)$,
Frenet’s moving basis of a given curve of the general type.

[Figures: Frenet’s basis of a plane curve; Frenet’s basis of a space curve]

By construction (we omit the argument $s$ to simplify the
formulas)

$\dot{\mathbf{t}} = k\mathbf{n}.$

In addition, since $\mathbf{b} = \mathbf{t} \times \mathbf{n}$, we have

$\dot{\mathbf{b}} = \dot{\mathbf{t}} \times \mathbf{n} + \mathbf{t} \times \dot{\mathbf{n}} = \mathbf{t} \times \dot{\mathbf{n}},$

whence it follows that $\dot{\mathbf{b}}\,\mathbf{t} = 0$. Since by Lemma 1 $\dot{\mathbf{b}}\,\mathbf{b} = 0$,
this proves that the vector $\dot{\mathbf{b}}$ is collinear with the vector $\mathbf{n}$,
i.e. there exists a number $\varkappa = \varkappa(s)$ such that

$\dot{\mathbf{b}} = -\varkappa\mathbf{n}.$

The number $\varkappa$ is called the torsion of a given curve at a point
$s$. It is the rotation velocity of the binormal vector.

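Differentiating now the equations $\mathbf{n}\,\mathbf{t} = 0$ and $\mathbf{n}\,\mathbf{b} = 0$
we at once see that $\dot{\mathbf{n}}\,\mathbf{t} = -\mathbf{n}\,\dot{\mathbf{t}} = -k$ and
$\dot{\mathbf{n}}\,\mathbf{b} = -\mathbf{n}\,\dot{\mathbf{b}} = \varkappa$.
Since in addition $\dot{\mathbf{n}}\,\mathbf{n} = 0$ (Lemma 1) this proves that

$\dot{\mathbf{n}} = -k\mathbf{t} + \varkappa\mathbf{b}.$

Thus for any general type curve we have the formulas

(16)    $\dot{\mathbf{t}} = k\mathbf{n},\quad \dot{\mathbf{n}} = -k\mathbf{t} + \varkappa\mathbf{b},\quad \dot{\mathbf{b}} = -\varkappa\mathbf{n}.$ □

These formulas are called Frenet’s formulas for a space
curve.

[Figure: A circular helix]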

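Example 1. If a curve $\mathbf{r} = \mathbf{r}(s)$ lies in a plane $\Pi$, then
the vectors $\dot{\mathbf{r}}(s)$ and $\ddot{\mathbf{r}}(s)$ are parallel to that plane (for this is the
case for the increments $\mathbf{r}(s + \Delta s) - \mathbf{r}(s)$ and
$\dot{\mathbf{r}}(s + \Delta s) - \dot{\mathbf{r}}(s)$ of the vectors $\mathbf{r}(s)$ and $\dot{\mathbf{r}}(s)$).
Therefore $\mathbf{t}(s), \mathbf{n}(s) \parallel \Pi$
and hence $\mathbf{b}(s) \perp \Pi$. This proves that $\mathbf{b}(s) = \mathrm{const}$ and so
$\varkappa(s) = 0$ for all $s$. Conversely, let $\varkappa(s) = 0$ for all $s$ and
hence $\mathbf{b}(s) = \mathbf{b}_0 = \mathrm{const}$. Then $(\mathbf{r}(s)\,\mathbf{b}_0)^{\cdot} = \mathbf{t}(s)\,\mathbf{b}_0 = 0$ for
all $s$ and therefore $\mathbf{r}(s)\,\mathbf{b}_0 = \mathrm{const}$. This means that the
curve $\mathbf{r} = \mathbf{r}(s)$ lies in the plane $\mathbf{r}\,\mathbf{b}_0 = \mathrm{const}$. Thus a curve in
space is a plane curve if and only if its torsion is identically
zero. □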

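Example 2. A circular helix is the path described by a
point moving at a constant velocity along a generator of a
right circular cylinder rotating uniformly about its axis.
The equations of the helix are of the form

$x = a\cos t,\quad y = a\sin t,\quad z = bt.$

We have

$x' = -a\sin t,\quad y' = a\cos t,\quad z' = b,$

whence

$s' = \sqrt{(x')^2 + (y')^2 + (z')^2} = \sqrt{a^2 + b^2}.$

Thus $s = ct$, where $c = \sqrt{a^2 + b^2}$, and hence

$x = a\cos\dfrac{s}{c},\quad y = a\sin\dfrac{s}{c},\quad z = \dfrac{b}{c}\,s.$

Since

$\dot{x} = -\dfrac{a}{c}\sin\dfrac{s}{c},\quad \dot{y} = \dfrac{a}{c}\cos\dfrac{s}{c},\quad \dot{z} = \dfrac{b}{c},$

$\ddot{x} = -\dfrac{a}{c^2}\cos\dfrac{s}{c},\quad \ddot{y} = -\dfrac{a}{c^2}\sin\dfrac{s}{c},\quad \ddot{z} = 0,$

we have

$k = \sqrt{\ddot{x}^2 + \ddot{y}^2 + \ddot{z}^2} = \dfrac{a}{c^2} = \mathrm{const}$

and

$\mathbf{t} = -\dfrac{a}{c}\sin\dfrac{s}{c}\,\mathbf{i} + \dfrac{a}{c}\cos\dfrac{s}{c}\,\mathbf{j} + \dfrac{b}{c}\,\mathbf{k},$

$\mathbf{n} = -\cos\dfrac{s}{c}\,\mathbf{i} - \sin\dfrac{s}{c}\,\mathbf{j},$

$\mathbf{b} = \mathbf{t} \times \mathbf{n} = \begin{vmatrix} \mathbf{i} & \mathbf{j} & \mathbf{k} \\ -\dfrac{a}{c}\sin\dfrac{s}{c} & \dfrac{a}{c}\cos\dfrac{s}{c} & \dfrac{b}{c} \\ -\cos\dfrac{s}{c} & -\sin\dfrac{s}{c} & 0 \end{vmatrix} = \dfrac{b}{c}\sin\dfrac{s}{c}\,\mathbf{i} - \dfrac{b}{c}\cos\dfrac{s}{c}\,\mathbf{j} + \dfrac{a}{c}\,\mathbf{k}.$

Therefore

$\dot{\mathbf{b}} = \dfrac{b}{c^2}\cos\dfrac{s}{c}\,\mathbf{i} + \dfrac{b}{c^2}\sin\dfrac{s}{c}\,\mathbf{j} = -\dfrac{b}{c^2}\,\mathbf{n}$

and so

$\varkappa = \dfrac{b}{c^2} = \mathrm{const}.$

Thus the curvature and the torsion of a circular helix are constant. □

According to a general theorem, which we are going to
prove in the next lecture, the converse is also true: every curve whose
curvature and torsion are constant is a circular helix (or its
arc).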
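
The values $k = a/c^2$ and $\varkappa = b/c^2$ for the helix can be cross-checked without passing to the natural parameter. The sketch below uses the standard expressions $k = |\mathbf{r}' \times \mathbf{r}''| / |\mathbf{r}'|^3$ and $\varkappa = (\mathbf{r}' \times \mathbf{r}'')\,\mathbf{r}''' / |\mathbf{r}' \times \mathbf{r}''|^2$ for a curve referred to an arbitrary parameter; these expressions are not derived in the lecture and are brought in here as an assumption, as are the function names and the sample values $a = 3$, $b = 4$.

```python
import math


def cross(u, v):
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])


def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))


def norm(u):
    return math.sqrt(dot(u, u))


def curvature_torsion(r1, r2, r3):
    """Curvature and torsion from the first three derivatives of r(t)."""
    c = cross(r1, r2)
    return norm(c) / norm(r1) ** 3, dot(c, r3) / dot(c, c)


def helix_derivatives(t, a, b):
    """First three derivatives of r(t) = (a cos t, a sin t, b t)."""
    r1 = (-a * math.sin(t), a * math.cos(t), b)
    r2 = (-a * math.cos(t), -a * math.sin(t), 0.0)
    r3 = (a * math.sin(t), -a * math.cos(t), 0.0)
    return r1, r2, r3


if __name__ == "__main__":
    a, b = 3.0, 4.0
    c2 = a * a + b * b
    k, kappa = curvature_torsion(*helix_derivatives(0.9, a, b))
    print(k, a / c2)
    print(kappa, b / c2)
```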




Lecture 24 


Projections of a curve onto the coordinate planes of the moving
n-hedron. Frenet’s formulas for a curve in n-dimensional space.
Representation of a curve by its curvatures. Regular surfaces.
Examples of surfaces


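To investigate the behaviour of an arbitrary space curve
$\mathbf{r} = \mathbf{r}(s)$ near some of its points we choose the origin $O$ in
that point, choose the vectors $\mathbf{t}_0$, $\mathbf{n}_0$, $\mathbf{b}_0$ of the moving n-hedron
in the point $O$ to be the vectors of the basis $\mathbf{i}$, $\mathbf{j}$, $\mathbf{k}$ and count
the natural parameter $s$ off from $O$. Then

$\mathbf{r}(0) = \mathbf{0},\quad \dot{\mathbf{r}}(0) = \mathbf{t}_0 = \mathbf{i},\quad \ddot{\mathbf{r}}(0) = k_0\mathbf{n}_0 = k_0\mathbf{j},$

$\dddot{\mathbf{r}}(0) = (k\mathbf{n})^{\cdot}_0 = -k_0^2\,\mathbf{i} + \dot{k}_0\,\mathbf{j} + k_0\varkappa_0\,\mathbf{k},$

where $k_0$, $\dot{k}_0$ and $\varkappa_0$ are the values of the functions $k$, $\dot{k}$ and $\varkappa$
for $s = 0$. Hence, using the Taylor formula,

$\mathbf{r}(s) = \mathbf{r}(0) + s\,\dot{\mathbf{r}}(0) + \dfrac{s^2}{2}\,\ddot{\mathbf{r}}(0) + \dfrac{s^3}{6}\,\dddot{\mathbf{r}}(0) + \ldots = \left(s - \dfrac{k_0^2}{6}s^3 + \ldots\right)\mathbf{i} + \left(\dfrac{k_0}{2}s^2 + \dfrac{\dot{k}_0}{6}s^3 + \ldots\right)\mathbf{j} + \left(\dfrac{k_0\varkappa_0}{6}s^3 + \ldots\right)\mathbf{k}.$

This implies that near the point $O$ our curve is given by the
parametric equations

$x = s + \ldots,\quad y = \dfrac{k_0}{2}s^2 + \ldots,\quad z = \dfrac{k_0\varkappa_0}{6}s^3 + \ldots$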


If $k_0 \ne 0$, $\varkappa_0 \ne 0$, then the projection of the curve onto the
plane $O\mathbf{i}\mathbf{j} = O\mathbf{t}_0\mathbf{n}_0$ (incidentally, this plane is called the
osculating plane of the curve at the point $O$) approximately
coincides with the parabola

$x = s,\quad y = \dfrac{k_0}{2}s^2,$

its projection onto the plane $O\mathbf{j}\mathbf{k} = O\mathbf{n}_0\mathbf{b}_0$ (called the normal
plane of the curve at the point $O$) does with the semicubical
parabola

$y = \dfrac{k_0}{2}s^2,\quad z = \dfrac{k_0\varkappa_0}{6}s^3,$

and finally its projection onto the plane $O\mathbf{i}\mathbf{k} = O\mathbf{t}_0\mathbf{b}_0$ (called
a rectifying plane of the curve at the point $O$) does with the
cubical parabola

$x = s,\quad z = \dfrac{k_0\varkappa_0}{6}s^3.$

This gives a fairly clear idea of how a space curve is constructed
near any of its points (at which curvature and torsion
are different from zero).
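
The local form $x \approx s - \frac{k_0^2}{6}s^3$, $y \approx \frac{k_0}{2}s^2 + \frac{\dot{k}_0}{6}s^3$, $z \approx \frac{k_0\varkappa_0}{6}s^3$ can be tested on a circular helix, for which $\dot{k}_0 = 0$. The sketch below takes the helix $x = a\cos(s/c)$, $y = a\sin(s/c)$, $z = bs/c$ with the sample values $a = 3$, $b = 4$ (an assumption made for illustration), expresses $\mathbf{r}(s) - \mathbf{r}(0)$ in the moving basis at $s = 0$ and compares the result with the cubic approximation.

```python
import math

A, B = 3.0, 4.0
C = math.hypot(A, B)
K0, KAPPA0 = A / C ** 2, B / C ** 2  # curvature and torsion of the helix

# Moving basis of the helix at s = 0 (t0, n0, b0 in the fixed coordinates).
T0 = (0.0, A / C, B / C)
N0 = (-1.0, 0.0, 0.0)
B0 = (0.0, -B / C, A / C)


def helix(s):
    """Helix referred to the natural parameter s."""
    return (A * math.cos(s / C), A * math.sin(s / C), B * s / C)


def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))


def local_coordinates(s):
    """Coordinates of r(s) - r(0) with respect to the basis t0, n0, b0."""
    d = tuple(p - q for p, q in zip(helix(s), helix(0.0)))
    return dot(d, T0), dot(d, N0), dot(d, B0)


def canonical(s):
    """Cubic approximation of the curve near the origin (k0' = 0 for the helix)."""
    return (s - K0 ** 2 * s ** 3 / 6,
            K0 * s ** 2 / 2,
            K0 * KAPPA0 * s ** 3 / 6)


if __name__ == "__main__":
    for s in (0.2, 0.1, 0.05):
        err = max(abs(p - q) for p, q in zip(local_coordinates(s), canonical(s)))
        print(s, err)
```

Halving $s$ should reduce the discrepancy roughly sixteenfold, since the neglected terms are of the fourth order in $s$.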

We now extend the results obtained in the preceding lecture
to include the case of an arbitrary $n$.

Let $\mathbf{x} = \mathbf{x}(s)$, $|s| \le s_0$, be an arbitrary curve (referred to
the natural parameter) in an $n$-dimensional oriented Euclidean
space $\mathcal{E}$. Assuming that for any $s$ the vectors

$\dot{\mathbf{x}}(s),\ \ldots,\ \mathbf{x}^{(n-1)}(s)$

are linearly independent (such curves are called curves of the
general type) and applying to those vectors the Gram-Schmidt
orthogonalization process we obtain an orthonormal family
of vectors $\mathbf{t}_1(s), \ldots, \mathbf{t}_{n-1}(s)$. Let $\mathbf{t}_n(s)$ be a vector (uniquely
defined) extending that family to a positively oriented
orthonormal basis

(1)    $\mathbf{t}_1(s),\ \ldots,\ \mathbf{t}_{n-1}(s),\ \mathbf{t}_n(s).$
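
For $n = 3$ the construction of basis (1) can be carried out by hand and compared with the moving basis of a circular helix. The sketch below applies the Gram-Schmidt process to the first two derivatives of the helix $x = a\cos(s/c)$, $y = a\sin(s/c)$, $z = bs/c$ and completes the pair to a positively oriented basis with a vector product; the helper names and the sample values $a = 3$, $b = 4$ are assumptions for illustration. In dimension three the vector product is one way to obtain the last vector; for general $n$ it is fixed by the orientation requirement instead.

```python
import math


def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))


def cross(u, v):
    return [u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0]]


def gram_schmidt(vectors):
    """Orthonormalize linearly independent vectors, keeping their order."""
    basis = []
    for v in vectors:
        w = list(v)
        for e in basis:
            c = dot(w, e)  # component of w along an already constructed vector
            w = [wi - c * ei for wi, ei in zip(w, e)]
        length = math.sqrt(dot(w, w))
        basis.append([wi / length for wi in w])
    return basis


def frenet_basis_helix(s, a=3.0, b=4.0):
    """Moving basis t1, t2, t3 of the helix at the point s."""
    c = math.hypot(a, b)
    d1 = [-a / c * math.sin(s / c), a / c * math.cos(s / c), b / c]
    d2 = [-a / c ** 2 * math.cos(s / c), -a / c ** 2 * math.sin(s / c), 0.0]
    t1, t2 = gram_schmidt([d1, d2])
    t3 = cross(t1, t2)  # completes t1, t2 to a positively oriented basis
    return t1, t2, t3


if __name__ == "__main__":
    for vec in frenet_basis_helix(1.3):
        print(vec)
```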



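[Figures: Projection onto the osculating plane; Projection onto the normal plane; Projection onto a rectifying plane]

Definition 1. Basis (1) is called Frenet’s moving basis of a
curve $\mathbf{x} = \mathbf{x}(s)$ of the general type at a point $s$.

Let

$\dot{\mathbf{t}}_i = \sum_{j=1}^{n} a_{ij}\,\mathbf{t}_j,\quad i = 1, \ldots, n$

(we omit the argument $s$ to simplify the formulas). Since by
construction the vector $\mathbf{t}_i$, $i = 1, \ldots, n - 1$, is linearly
expressible in terms of the vectors $\dot{\mathbf{x}}, \ldots, \mathbf{x}^{(i)}$, the vector $\dot{\mathbf{t}}_i$ is
linearly expressible in terms of the vectors $\dot{\mathbf{x}}, \ldots, \mathbf{x}^{(i+1)}$. Since the
last vectors are linearly expressible in terms of the vectors
$\mathbf{t}_1, \ldots, \mathbf{t}_{i+1}$, this proves that $a_{ij} = 0$ provided $j > i + 1$.

On the other hand, since $\mathbf{t}_i\,\mathbf{t}_j = \delta_{ij}$, we have
$\dot{\mathbf{t}}_i\,\mathbf{t}_j + \mathbf{t}_i\,\dot{\mathbf{t}}_j = 0$, i.e.

$a_{ij} + a_{ji} = 0.$

Therefore $a_{ii} = 0$ and $a_{ij} = 0$ provided $j < i - 1$.

Thus only the coefficients $a_{i,i+1} = -a_{i+1,i}$ can be nonzero.
Setting

$k_1 = a_{12},\quad k_2 = a_{23},\quad \ldots,\quad k_{n-1} = a_{n-1,n}$

we therefore see that the following formulas hold:

(2)    $\dot{\mathbf{t}}_1 = k_1\mathbf{t}_2,$
       $\dot{\mathbf{t}}_2 = -k_1\mathbf{t}_1 + k_2\mathbf{t}_3,$
       $\ldots$
       $\dot{\mathbf{t}}_{n-1} = -k_{n-2}\mathbf{t}_{n-2} + k_{n-1}\mathbf{t}_n,$
       $\dot{\mathbf{t}}_n = -k_{n-1}\mathbf{t}_{n-1}.$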

These formulas are called Frenet’s formulas for a curve in
n-dimensional space.

The functions $k_1 = k_1(s), \ldots, k_{n-1} = k_{n-1}(s)$ are called
the curvatures of a curve. They are defined, we stress, only for
a curve of the general type.

In the formulas

(3)    $\mathbf{t}_i = \beta_{i1}\dot{\mathbf{x}} + \ldots + \beta_{ii}\mathbf{x}^{(i)},\quad i = 1, \ldots, n - 1,$

resulting from applying the Gram-Schmidt orthogonalization
process the last coefficients $\beta_{ii}$ are positive. Therefore
in the reverse formulas

(4)    $\mathbf{x}^{(i)} = \gamma_{i1}\mathbf{t}_1 + \ldots + \gamma_{ii}\mathbf{t}_i$

the coefficients $\gamma_{ii} = 1/\beta_{ii}$ are also positive. Differentiating
formulas (3) we get

$\dot{\mathbf{t}}_i = \dot{\beta}_{i1}\dot{\mathbf{x}} + (\dot{\beta}_{i2} + \beta_{i1})\ddot{\mathbf{x}} + \ldots + (\dot{\beta}_{ii} + \beta_{i,i-1})\mathbf{x}^{(i)} + \beta_{ii}\mathbf{x}^{(i+1)},\quad i = 1, \ldots, n - 1.$

On replacing here (provided $i \le n - 2$) the vectors $\dot{\mathbf{x}}, \ldots, \mathbf{x}^{(i+1)}$
by their expressions (4) we must get formulas (2).
This shows that

$k_i = \beta_{ii}\gamma_{i+1,i+1}$ for any $i = 1, \ldots, n - 2$.

This proves that for any curve of the general type the curvatures

$k_1,\ \ldots,\ k_{n-2}$

are positive. The curvature $k_{n-1}$ (the analogue of torsion),
on the other hand, may have any sign.








Now we show that any $n - 1$ functions

(5)    $k_1(s) > 0,\ \ldots,\ k_{n-2}(s) > 0,\ k_{n-1}(s)$

may serve as the curvatures of some curve and that these
curvatures uniquely (up to congruence) determine the curve.

Theorem 1. Let $n - 1$ smooth functions $k_i(s)$, all positive
except possibly the last, be given on an arbitrary interval
$|s| \le s_0$. Then for any initial point $O$ and any positively oriented
orthonormal basis $\mathbf{i}_1, \ldots, \mathbf{i}_n$ there exists one and only one
curve $\mathbf{x} = \mathbf{x}(s)$, $|s| \le s_0$, of the general type possessing the
following two properties:

(i) the curvatures of the curve are the given functions $k_i(s)$;

(ii) for $s = 0$ we have

$\mathbf{x}(0) = \mathbf{0},\quad \mathbf{t}_1(0) = \mathbf{i}_1,\ \ldots,\ \mathbf{t}_n(0) = \mathbf{i}_n.$

Proof. We carry out the proof in stages.

Stage 1. At this stage we use the following general theorem
known as the theorem of the existence and uniqueness of
solutions (EUS) of linear differential equations, which will be
proved in the third semester’s course in differential equations.

Theorem (EUS). Let $m^2$ smooth functions $A_{ij}(s)$,
$i, j = 1, \ldots, m$, be given on an arbitrary interval $|s| \le s_0$
and let $x_1^{(0)}, \ldots, x_m^{(0)}$ be arbitrary numbers. Then there exists
one and only one family of smooth functions $x_1(s), \ldots, x_m(s)$,
$|s| \le s_0$, possessing the following two properties:

(i) identically in $s$, $|s| \le s_0$, the relations

(6)    $\dot{x}_1 = A_{11}x_1 + \ldots + A_{1m}x_m,$
       $\ldots$
       $\dot{x}_m = A_{m1}x_1 + \ldots + A_{mm}x_m$

hold;

(ii) for $s = 0$ we have

$x_1(0) = x_1^{(0)},\ \ldots,\ x_m(0) = x_m^{(0)}.$ □

We shall apply this theorem to relations (2), which for the
given functions $k_1, \ldots, k_{n-1}$ are equations of the form (6)
for the $m = n^2$ coordinates of the vectors $\mathbf{t}_1, \ldots, \mathbf{t}_n$. Thus, according
to the EUS theorem, there exists one and only one family
of vector-valued functions $\mathbf{t}_1 = \mathbf{t}_1(s), \ldots, \mathbf{t}_n = \mathbf{t}_n(s)$ on
the interval $|s| \le s_0$ such that

(i) for any $s$ there are relations (2);

(ii) for $s = 0$ there are

(7)    $\mathbf{t}_1(0) = \mathbf{i}_1,\ \ldots,\ \mathbf{t}_n(0) = \mathbf{i}_n.$

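Stage 2. We consider the scalar products $\mathbf{t}_i\,\mathbf{t}_j$, $i, j = 1, \ldots, n$.
According to relations (2), for these products we have

$(\mathbf{t}_i\,\mathbf{t}_j)^{\cdot} = \dot{\mathbf{t}}_i\,\mathbf{t}_j + \mathbf{t}_i\,\dot{\mathbf{t}}_j = (-k_{i-1}\mathbf{t}_{i-1} + k_i\mathbf{t}_{i+1})\,\mathbf{t}_j + \mathbf{t}_i\,(-k_{j-1}\mathbf{t}_{j-1} + k_j\mathbf{t}_{j+1})$

(we assume by convention that $\mathbf{t}_0 = \mathbf{0}$ and $\mathbf{t}_{n+1} = \mathbf{0}$), i.e.
the equations

(8)    $(\mathbf{t}_i\,\mathbf{t}_j)^{\cdot} = -k_{i-1}(\mathbf{t}_{i-1}\,\mathbf{t}_j) + k_i(\mathbf{t}_{i+1}\,\mathbf{t}_j) - k_{j-1}(\mathbf{t}_i\,\mathbf{t}_{j-1}) + k_j(\mathbf{t}_i\,\mathbf{t}_{j+1}),$

which may be regarded as equations of the form (6) for
the $m = n^2$ functions $\mathbf{t}_i\,\mathbf{t}_j$. By the EUS theorem therefore
there exists only one set of these functions possessing the
property that for $s = 0$ they are equal to $\delta_{ij} = \mathbf{i}_i\,\mathbf{i}_j$ (i.e.
to zero if $i \ne j$ and to unity if $i = j$).

On the other hand, a direct check shows that equations (8)
are satisfied by the functions $\mathbf{t}_i\,\mathbf{t}_j$ identically equal to $\delta_{ij}$. (Indeed,
when $i \ne j - 1, j + 1$ all the terms of the sum
$-k_{i-1}\delta_{i-1,j} + k_i\delta_{i+1,j} - k_{j-1}\delta_{i,j-1} + k_j\delta_{i,j+1}$ are zero, and when
$i = j - 1, j + 1$ the sum has only two nonzero but mutually
cancelling terms.) Hence for all $s$ there are by virtue of the
EUS theorem equations $\mathbf{t}_i\,\mathbf{t}_j = \delta_{ij}$, $i, j = 1, \ldots, n$, implying
that for any $s$, $|s| \le s_0$, the vectors $\mathbf{t}_1, \ldots, \mathbf{t}_n$
constitute an orthonormal basis.

Since for $s = 0$ that basis coincides with the positively
oriented basis $\mathbf{i}_1, \ldots, \mathbf{i}_n$, the basis $\mathbf{t}_1, \ldots, \mathbf{t}_n$ is positively
oriented for any $s$ too.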

Stage 3. We compose consecutive derivatives of the
vector $\mathbf{t}_1$

(9)    $\mathbf{t}_1,\ \dot{\mathbf{t}}_1,\ \ddot{\mathbf{t}}_1,\ \ldots,\ \mathbf{t}_1^{(n-1)}$

and apply to them the Gram-Schmidt orthogonalization process.
Since the vector $\mathbf{t}_1$ is a unit vector, we need not do
anything in the first step of the process. Since the vector $\dot{\mathbf{t}}_1$
is orthogonal to the vector $\mathbf{t}_1$ (by Lemma 1 of the preceding
lecture), in the second step we must only normalize it.
Since according to what has been proved the vector $\mathbf{t}_2$ is a
unit vector and $k_1 > 0$ by the hypothesis, according to the
first of the relations (2) $|\dot{\mathbf{t}}_1| = k_1$. In the second step therefore
we obtain the vector

$\dfrac{\dot{\mathbf{t}}_1}{k_1} = \mathbf{t}_2.$

In the third step we should consider the vector

$\ddot{\mathbf{t}}_1 = (k_1\mathbf{t}_2)^{\cdot} = -k_1^2\,\mathbf{t}_1 + \dot{k}_1\mathbf{t}_2 + k_1k_2\,\mathbf{t}_3,$

subtract from it the linear combination of the vectors $\mathbf{t}_1$ and $\mathbf{t}_2$
to obtain a vector orthogonal to those vectors and then normalize
the vector. But since according to what has been
proved the vectors $\mathbf{t}_1$, $\mathbf{t}_2$, $\mathbf{t}_3$ constitute an orthonormal family
and by the hypothesis $k_1k_2 > 0$, the result of this procedure
is obviously the vector $\mathbf{t}_3$.

It is clear that this reasoning is of a general character so
that at each step of the orthogonalization process we obtain
the corresponding vector $\mathbf{t}_i$, $i = 1, \ldots, n - 1$. This proves
that the family of vectors $\mathbf{t}_1, \mathbf{t}_2, \ldots, \mathbf{t}_{n-1}$ is uniquely
characterized as an orthonormal family of vectors obtained
from family (9) by the Gram-Schmidt orthogonalization
process.

Stage 4. Let

(10)    $\mathbf{x}(s) = \int_0^s \mathbf{t}_1(s)\,ds.$

Then $\mathbf{x}(0) = \mathbf{0}$ and $\dot{\mathbf{x}}(s) = \mathbf{t}_1(s)$, i.e. the curve $\mathbf{x} = \mathbf{x}(s)$,
$|s| \le s_0$, begins at the point $O$ and has at a point $\mathbf{x}(s)$ the
tangent vector $\mathbf{t}_1(s)$. But for every curve the first $n - 1$
vectors of the moving basis are vectors obtained from the
first $n - 1$ derivatives of the tangent vector by the Gram-Schmidt
orthogonalization process. According to the foregoing
therefore those vectors coincide with the vectors
$\mathbf{t}_1, \ldots, \mathbf{t}_{n-1}$.

As to the last vector of the moving basis, it is uniquely
characterized as the unit vector constituting together with the
first $n - 1$ vectors a positively oriented basis. Since the
basis $\mathbf{t}_1, \ldots, \mathbf{t}_{n-1}, \mathbf{t}_n$ was seen to be positively oriented,
that vector must be the vector $\mathbf{t}_n$.

Thus we have proved that for any $s$ the vectors
$\mathbf{t}_1(s), \ldots, \mathbf{t}_n(s)$ constitute the moving basis of the curve
$\mathbf{x} = \mathbf{x}(s)$. Since for these vectors we have Frenet’s formulas (2),
the functions $k_i(s)$, $i = 1, \ldots, n - 1$, appearing in the
formulas must be the curvatures of the curve $\mathbf{x} = \mathbf{x}(s)$.

This completes the proof of the existence of a curve
$\mathbf{x} = \mathbf{x}(s)$ possessing properties (i) and (ii).

The uniqueness of the curve follows from the fact that
according to the EUS theorem the moving basis
$\mathbf{t}_1(s), \ldots, \mathbf{t}_n(s)$ is uniquely defined by equations (2) and the
initial conditions (7), and the radius vector $\mathbf{x}(s)$ is uniquely
defined (by formula (10)) by the relation $\dot{\mathbf{x}}(s) = \mathbf{t}_1(s)$ and
the initial condition $\mathbf{x}(0) = \mathbf{0}$. □
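
The construction used in the proof (solve Frenet’s equations for the moving basis, then integrate the first basis vector) can be imitated numerically. The sketch below does this for $n = 3$, i.e. for the system $\dot{\mathbf{x}} = \mathbf{t}$, $\dot{\mathbf{t}} = k\mathbf{n}$, $\dot{\mathbf{n}} = -k\mathbf{t} + \varkappa\mathbf{b}$, $\dot{\mathbf{b}} = -\varkappa\mathbf{n}$, with a classical fourth-order Runge-Kutta scheme. The choice of integrator, the step count and all names are assumptions of this illustration and are not part of the lecture; the initial basis is taken to be the coordinate basis and the initial point the origin.

```python
import math


def rhs(s, y, k, kappa):
    """Right-hand side of the system for y = (x, t, n, b), stored as 12 numbers."""
    t, n, b = y[3:6], y[6:9], y[9:12]
    ks, ts = k(s), kappa(s)
    return (list(t)
            + [ks * ni for ni in n]
            + [-ks * ti + ts * bi for ti, bi in zip(t, b)]
            + [-ts * ni for ni in n])


def rk4(y, s0, s1, steps, k, kappa):
    """Classical Runge-Kutta integration of the system from s0 to s1."""
    h = (s1 - s0) / steps
    s = s0
    for _ in range(steps):
        k1 = rhs(s, y, k, kappa)
        k2 = rhs(s + h / 2, [yi + h / 2 * d for yi, d in zip(y, k1)], k, kappa)
        k3 = rhs(s + h / 2, [yi + h / 2 * d for yi, d in zip(y, k2)], k, kappa)
        k4 = rhs(s + h, [yi + h * d for yi, d in zip(y, k3)], k, kappa)
        y = [yi + h / 6 * (a + 2 * b + 2 * c + d)
             for yi, a, b, c, d in zip(y, k1, k2, k3, k4)]
        s += h
    return y


def integrate_frenet(k, kappa, s_end, steps=2000):
    """Curve with curvature k(s) and torsion kappa(s), started at the origin
    with the moving basis equal to the coordinate basis."""
    y0 = [0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 1.0]
    return rk4(y0, 0.0, s_end, steps, k, kappa)


if __name__ == "__main__":
    # Constant curvature 1 and zero torsion: a unit circle in the plane z = 0.
    y = integrate_frenet(lambda s: 1.0, lambda s: 0.0, math.pi / 2)
    print(y[0:3])
```

With $k = 1$ and $\varkappa = 0$ the computed curve should follow the unit circle $x = \sin s$, $y = 1 - \cos s$, and for any choice of the functions the computed vectors should stay orthonormal up to the integration error, mirroring Stage 2 of the proof.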

By analogy with Definitions 1 to 3 of Lecture 22, for
any $k$, $0 < k < n$, a “parametric” definition can be given of
a $k$-dimensional surface in $n$-dimensional space. For simplicity
we confine ourselves to the case where $k = 2$ and $n = 3$.

Let $W$ be an arbitrary open set in the two-dimensional
space $\mathbb{R}^2$ whose points are pairs $(u, v)$ of real numbers. An
arbitrary mapping $W \to \mathcal{E}$ of that set into a three-dimensional
Euclidean space $\mathcal{E}$ is given (if an origin $O$ is chosen in
$\mathcal{E}$) either by a vector-valued function $\mathbf{r} = \mathbf{r}(u, v)$ defined in
$W$ or (if rectangular coordinates $x$, $y$, $z$ are introduced in $\mathcal{E}$)
by three numerical functions

(11)    $x = x(u, v),\quad y = y(u, v),\quad z = z(u, v).$

As before, we shall consider only smooth mappings
$(u, v) \mapsto \mathbf{r}(u, v)$, i.e. such that functions (11) are smooth in $W$.
The partial derivatives

(12)    $\mathbf{r}_u = x_u\mathbf{i} + y_u\mathbf{j} + z_u\mathbf{k},\quad \mathbf{r}_v = x_v\mathbf{i} + y_v\mathbf{j} + z_v\mathbf{k}$

will therefore be defined (we omit the arguments $u$, $v$ for
simplicity).



Definition 2. A mapping $(u, v) \mapsto \mathbf{r}(u, v)$ is said to be a
regular surface if for any point $(u, v) \in W$ vectors (12) are
linearly independent.

The set of all points of $\mathcal{E}$ whose radius vectors are of the
form $\mathbf{r}(u, v)$, $(u, v) \in W$, is called the support of the surface
$\mathbf{r} = \mathbf{r}(u, v)$.

Recall from the course in analysis that a bijective mapping
$W \to W_1$ of an open set $W \subset \mathbb{R}^2$ onto an open set
$W_1 \subset \mathbb{R}^2$ is said to be a diffeomorphism if the functions

(13)    $u_1 = u_1(u, v),\quad v_1 = v_1(u, v)$

and

(14)    $u = u(u_1, v_1),\quad v = v(u_1, v_1)$

giving the mapping $W \to W_1$ and the inverse mapping
$W_1 \to W$ are smooth functions. For the bijective mapping
given by smooth functions (13) to be a diffeomorphism it is
necessary and sufficient that its Jacobian

(15)    $\dfrac{\partial(u_1, v_1)}{\partial(u, v)} = \begin{vmatrix} \dfrac{\partial u_1}{\partial u} & \dfrac{\partial u_1}{\partial v} \\ \dfrac{\partial v_1}{\partial u} & \dfrac{\partial v_1}{\partial v} \end{vmatrix}$

should be nonzero everywhere in domain $W$. If, on the other
hand, Jacobian (15) of functions (13) (which a priori are not
assumed to give a bijection) is nonzero everywhere in $W$,
then the mapping they give is a local diffeomorphism, i.e. for
any point $(u_0, v_0) \in W$ there exists a neighbourhood $U \subset W$
in which the mapping is a diffeomorphism onto some neighbourhood
$U_1 \subset W_1$ of the point $(u_1(u_0, v_0), v_1(u_0, v_0))$ (this
is the so-called inverse transform theorem).

Now let us be given two surfaces:

(16)    $\mathbf{r} = \mathbf{r}(u, v),\quad (u, v) \in W,$

and

(17)    $\mathbf{r} = \mathbf{r}_1(u_1, v_1),\quad (u_1, v_1) \in W_1.$

Definition 3. Surfaces (16) and (17) are said to be equivalent
if there exists a diffeomorphism

$u_1 = u_1(u, v),\quad v_1 = v_1(u, v)$

of the open set $W$ onto the open set $W_1$ such that

$\mathbf{r}(u, v) = \mathbf{r}_1(u_1(u, v), v_1(u, v))$

for any point $(u, v) \in W$.

It is clear that equivalent surfaces have the same support.
Any smooth function $z = z(x, y)$ of two variables defined
in domain $W$ gives by the formula

$\mathbf{r}(u, v) = u\,\mathbf{i} + v\,\mathbf{j} + z(u, v)\,\mathbf{k}$

a regular surface called the graph of that function.

It turns out that with an appropriate choice of coordinate
axes any regular surface is locally equivalent to the graph of
some smooth function. Indeed, since the vectors $\mathbf{r}_u$ and $\mathbf{r}_v$ are
linearly independent, at any point $(u_0, v_0) \in W$ the rank of
the matrix

$\begin{pmatrix} x_u & y_u & z_u \\ x_v & y_v & z_v \end{pmatrix}$

equals two, i.e. at least one of its minors of the second order
is nonzero. For definiteness let

$\begin{vmatrix} x_u & y_u \\ x_v & y_v \end{vmatrix} \ne 0.$

Then, by the inverse transform theorem (applied to the functions
$x = x(u, v)$ and $y = y(u, v)$), there will exist a neighbourhood
$U_0 \subset W$ of the point $(u_0, v_0)$ and a neighbourhood
$V_0 \subset \mathbb{R}^2$ of the point $(x_0, y_0) \in \mathbb{R}^2$, where $x_0 = x(u_0, v_0)$,
$y_0 = y(u_0, v_0)$, such that the functions $x = x(u, v)$ and
$y = y(u, v)$ effect a diffeomorphism of the neighbourhood
$U_0$ onto the neighbourhood $V_0$. Then if

$u = u(x, y),\quad v = v(x, y)$

are the functions effecting the inverse diffeomorphism, in
the neighbourhood $U_0$ the surface $\mathbf{r} = \mathbf{r}(u, v)$ will be equivalent
to the graph of the function $z = z(u(x, y), v(x, y))$. □

Although a surface is not a set, terminologically it is often
identified with its support. Thus, for example, points of the
support of a surface are called points of the surface and
so on.




In general a regular surface may be a noninjective mapping
into $\mathcal{E}$ (it may have points, curves and even entire
domains of “self-intersection”) but in this semester’s lectures we
shall concern ourselves only with sufficiently small domains
of it in which it is equivalent to a graph and hence is an
injection.

If a point $M$ of a surface has radius vector $\mathbf{r}(u, v)$, then
the numbers $u$ and $v$ are said to be the coordinates of that
point on the surface. By virtue of the injectivity of the mapping
$(u, v) \mapsto \mathbf{r}(u, v)$ this definition is correct.

Any curve 

$$u = u(t), \quad v = v(t) \tag{18}$$

in the domain $W$ determines a curve

$$\mathbf{r} = \mathbf{r}(u(t), v(t)) \tag{19}$$

in $\mathcal{E}$ which is said to lie on the surface $\mathbf{r} = \mathbf{r}(u, v)$. Equations
(18) are called the equations of curve (19) in the coordinates $u$, $v$
on the surface.

In particular, defined on the surface are the curves $u = \text{const}$
and $v = \text{const}$. These are called coordinate curves and their
collection is called the coordinate network on the surface.
Examples of surfaces. 

1. The support of the surface 

$$x = R\cos u, \quad y = R\sin u, \quad z = v \tag{20}$$

is a right circular cylinder of radius $R$. Accordingly surface
(20) is also called a (circular) cylinder.

When $-\infty < u < +\infty$ each point of the cylinder is
covered an infinite (countable) number of times by the points
of the plane $(u, v)$. To attain injectivity it should be assumed
that $0 < u < 2\pi$, but then a “slotted” cylinder results.

All our considerations being local, we shall ignore such 
situations in what follows. 

The coordinate network on a cylinder consists of “vertical” 
straight lines u = const and “horizontal” circles v = const. 

2. Let $x = x(v)$, $z = z(v)$ be an arbitrary regular curve
on the plane $Oxz$ not intersecting the axis $Oz$. The surface

$$x = x(v)\cos u, \quad y = x(v)\sin u, \quad z = z(v) \tag{21}$$

is called a surface of revolution and the curve $x = x(v)$,
$z = z(v)$ is its profile. Intuitively, surface (21) is obtained
by rotating its profile about the axis $Oz$.

[Figure: A circular cylinder]

[Figure: A surface of revolution]

The regularity of surface (21), i.e. the linear independence of
the vectors

$$\mathbf{r}_u = (-x(v)\sin u,\; x(v)\cos u,\; 0),$$

$$\mathbf{r}_v = (x'(v)\cos u,\; x'(v)\sin u,\; z'(v)),$$

is ensured by the regularity of the profile (i.e. by the condition
$x'(v)^2 + z'(v)^2 \neq 0$) and by the fact that the profile
does not intersect the rotation axis $Oz$ (i.e. by $x(v) \neq 0$).
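The regularity condition just stated can be checked numerically. The following is only a sketch under assumptions: the catenary profile $x = \cosh v$, $z = v$ and the sample point are illustrative choices, and the helper names `cross` and `revolution_tangents` are hypothetical. It evaluates $\mathbf{r}_u$ and $\mathbf{r}_v$ for surface (21) and compares $|\mathbf{r}_u \times \mathbf{r}_v|^2$ with $x(v)^2\,(x'(v)^2 + z'(v)^2)$, which is nonzero exactly when both conditions above hold.

```python
import math

def cross(a, b):
    """Vector product of two 3-vectors given as tuples."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def revolution_tangents(x, dx, dz, u, v):
    """r_u and r_v of surface (21) for a profile x(v), z(v).

    dx and dz are the derivatives x'(v) and z'(v) of the profile.
    """
    r_u = (-x(v) * math.sin(u), x(v) * math.cos(u), 0.0)
    r_v = (dx(v) * math.cos(u), dx(v) * math.sin(u), dz(v))
    return r_u, r_v

# Illustrative profile (an assumption): the catenary x = cosh v, z = v.
x, dx, dz = math.cosh, math.sinh, (lambda v: 1.0)
u, v = 0.7, -0.3

r_u, r_v = revolution_tangents(x, dx, dz, u, v)
normal = cross(r_u, r_v)
norm_sq = sum(c * c for c in normal)

# Closed form: |r_u x r_v|^2 = x(v)^2 * (x'(v)^2 + z'(v)^2)
expected = x(v) ** 2 * (dx(v) ** 2 + dz(v) ** 2)
```

A nonzero `norm_sq` means that $\mathbf{r}_u$ and $\mathbf{r}_v$ are linearly independent at the chosen point.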


The coordinate network on surface (21) consists of
curves which are rotations of the profile about the axis $Oz$
(they are called meridians) and circles perpendicular to
them (parallels).

A cylinder is a surface of revolution whose profile
is the straight line $x = R$, $z = v$.

A surface of revolution with profile $x = R\cos v$,
$z = R\sin v$ (a circle) is the sphere

$$x = R\cos v\cos u, \quad y = R\cos v\sin u, \quad z = R\sin v$$

of radius $R$ with centre at the point $O$. The coordinates $u$ and
$v$ are the well-known “geographical coordinates”, longitude and
latitude, and the coordinate curves are geographical meridians
and parallels.

Note that strictly speaking we must consider only the
portion of the circle $x = R\cos v$, $z = R\sin v$ that does
not intersect the axis $Oz$ and hence only the corresponding
portion of the sphere (a “pole-punctured sphere”). This is
reflected in the fact that the coordinates $u$ and $v$ become
meaningless at the poles. We have already agreed above, however,
to ignore such phenomena.

3. A surface $\mathbf{r} = \mathbf{r}(u, v)$ is said to be a ruled surface if

$$\mathbf{r}(u, v) = \mathbf{p}(u) + v\mathbf{a}(u), \tag{22}$$

where $\mathbf{p}(u)$ and $\mathbf{a}(u)$ are arbitrary vector-valued functions
possessing the property (ensuring regularity) that the vectors
$\mathbf{p}'(u) + v\mathbf{a}'(u)$ and $\mathbf{a}(u)$ are linearly independent
for all $u$ and $v$ considered (so that, in particular, $\mathbf{a}(u) \neq 0$
for all $u$). A coordinate curve $u = u_0 = \text{const}$ is a straight
line, with direction vector $\mathbf{a}(u_0)$, passing through the point
with radius vector $\mathbf{p}(u_0)$. Thus, intuitively, a ruled surface
is swept out by a straight line moving in space. Cf. Definition 1
of Lecture 23 in [1].

[Figure: A cone]

[Figure: A cylinder]

It is clear that without loss of generality we may assume 
the vector a (u) to be a unit vector: 

$$|\mathbf{a}(u)| = 1 \quad \text{for all } u.$$

If $\mathbf{p}'(u) = 0$ for all $u$, i.e. $\mathbf{p}(u) = \text{const}$, then, after
translation of the origin, we obtain instead of (22) an equation
of the form

$$\mathbf{r} = v\mathbf{a}(u). \tag{23}$$

It is a cone whose directrix is a regular space curve $\mathbf{a} = \mathbf{a}(u)$.

If $\mathbf{a}'(u) = 0$ for all $u$, i.e. $\mathbf{a}(u) = \text{const}$, then surface (22)
is a cylinder with directrix $\mathbf{p} = \mathbf{p}(u)$ (a space one in general).

If the vector $\mathbf{p}'$ is not identically zero, then, going if
necessary to a smaller domain in $W$, we may assume that
$\mathbf{p}'(u) \neq 0$ for all $u$. Then $\mathbf{p} = \mathbf{p}(u)$ is a regular curve in
space and we may assume that $u$ is the natural parameter
(arc length) on that curve. Cone (23) may also be given by
an equation of form (22) with $\mathbf{p}'(u) \neq 0$. To do this it is
sufficient to put $\mathbf{p}(u) = \mathbf{a}(u)$ in (22) (if $\mathbf{a}'(u) \neq 0$ of
course).

If $\mathbf{a}(u)$ is the tangent vector $\mathbf{t}(u)$ of a curve $\mathbf{p} = \mathbf{p}(u)$,
then surface (22) is said to be a surface of tangents. Similarly
defined are a surface of principal normals and a surface of
binormals.

If a curve $\mathbf{p} = \mathbf{p}(u)$ is a plane curve, then its surface of
binormals is a cylinder over that curve.





Lecture 25 


Vectors tangential to a surface • The tangential plane • The
first quadratic form of a surface • Mensuration of lengths and
angles on a surface • Diffeomorphisms of surfaces • Isometries
and the intrinsic geometry of a surface • Examples • Developables


By analogy with Definition 5 of Lecture 22 the tangent
vector to a (regular) surface

$$\mathbf{r} = \mathbf{r}(u, v), \quad (u, v) \in W, \tag{1}$$

at a point $(u_0, v_0)$ is the tangent vector of an arbitrary curve
on the surface passing through the point $(u_0, v_0)$. Since locally
a surface is the graph of a smooth function, this definition
actually coincides with Definition 5 of Lecture 22 (i.e. gives
the same vectors). According to Proposition 1 of Lecture 23
therefore the collection of all the tangent vectors of surface (1)
at a given point $(u_0, v_0)$ is a two-dimensional vector
space. □

However, this fact is easy to prove directly as well. Indeed,
any curve on surface (1) passing at $t = t_0$ through a point
$(u_0, v_0)$ is given as a curve in space by a vector function of
the form

$$\mathbf{r}(t) = \mathbf{r}(u(t), v(t)) \quad \text{for } t \text{ near } t_0, \tag{2}$$

where $u = u(t)$ and $v = v(t)$ are smooth functions such
that $u(t_0) = u_0$ and $v(t_0) = v_0$. Therefore

$$\mathbf{r}'(t) = u'(t)\,\mathbf{r}_u + v'(t)\,\mathbf{r}_v \tag{3}$$

and in particular

$$\mathbf{r}'(t_0) = u'(t_0)\,(\mathbf{r}_u)_0 + v'(t_0)\,(\mathbf{r}_v)_0.$$







Thus any tangent vector to surface (1) at a point $(u_0, v_0)$ is
a linear combination of the vectors $(\mathbf{r}_u)_0$ and $(\mathbf{r}_v)_0$ (noncollinear
ones by the hypothesis). Conversely, if

$$\mathbf{c} = a\,(\mathbf{r}_u)_0 + b\,(\mathbf{r}_v)_0, \tag{4}$$

where $a$ and $b$ are arbitrary numbers, then $\mathbf{c} = \mathbf{r}'(t_0)$, where
$\mathbf{r}(t) = \mathbf{r}(u_0 + a(t - t_0),\, v_0 + b(t - t_0))$, and hence $\mathbf{c}$ is a
vector tangential to surface (1) at the point $(u_0, v_0)$.

This completes the proof, since vectors (4) constitute a
two-dimensional vector space. □


Definition 1. The vector space consisting of vectors (4)
is called the tangent plane to surface (1) at the point $(u_0, v_0)$.
The same term is applied also to the corresponding plane
in space passing through the point $\mathbf{r}(u_0, v_0)$. This plane has
direction bivector $(\mathbf{r}_u)_0 \wedge (\mathbf{r}_v)_0$ and is therefore given in
coordinates $x$, $y$, $z$ by the equation

$$\begin{vmatrix} x - x(u_0, v_0) & y - y(u_0, v_0) & z - z(u_0, v_0) \\ x_u(u_0, v_0) & y_u(u_0, v_0) & z_u(u_0, v_0) \\ x_v(u_0, v_0) & y_v(u_0, v_0) & z_v(u_0, v_0) \end{vmatrix} = 0.$$

The double meaning of the term “tangential plane” is of
course inconvenient, but no confusion will arise if care is
taken.

According to formula (3) the vectors $\mathbf{r}_u$ and $\mathbf{r}_v$ form a basis of
the tangential plane at a point $(u, v)$. By a tradition borrowed
from analysis the coordinates of tangent vectors relative to
the basis $\mathbf{r}_u$, $\mathbf{r}_v$ are designated by the symbols $du$ and $dv$, and the
vector with those coordinates is denoted by $d\mathbf{r}$. Writing
numerical factors at the right of the vectors we therefore get

$$d\mathbf{r} = \mathbf{r}_u\,du + \mathbf{r}_v\,dv, \tag{5}$$

just as for numerical functions.

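Now let

$$\mathbf{r} = \mathbf{r}_1(u_1, v_1), \quad (u_1, v_1) \in W_1, \tag{6}$$

be a surface equivalent to surface (1) (see Definition 3 in
Lecture 24) and let

$$u_1 = u_1(u, v), \quad v_1 = v_1(u, v) \tag{7}$$

be the corresponding diffeomorphism $W \to W_1$. Then

$$\mathbf{r}(u, v) = \mathbf{r}_1(u_1(u, v), v_1(u, v))$$

for any point $(u, v) \in W$ and therefore

$$\begin{aligned}
\mathbf{r}_u &= \frac{\partial u_1}{\partial u}(\mathbf{r}_1)_{u_1} + \frac{\partial v_1}{\partial u}(\mathbf{r}_1)_{v_1}, \\
\mathbf{r}_v &= \frac{\partial u_1}{\partial v}(\mathbf{r}_1)_{u_1} + \frac{\partial v_1}{\partial v}(\mathbf{r}_1)_{v_1}.
\end{aligned} \tag{8}$$

It follows that the linear span of the vectors $\mathbf{r}_u$ and $\mathbf{r}_v$ coincides
with that of the vectors $(\mathbf{r}_1)_{u_1}$ and $(\mathbf{r}_1)_{v_1}$, i.e. the tangential
plane to surface (1) at a point $(u, v)$ coincides with the
tangential plane to surface (6) at the point $(u_1, v_1) = (u_1(u, v),
v_1(u, v))$ (identical as a point in space with the point $(u, v)$).
In this sense the tangential planes of equivalent surfaces are
identical. □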

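A change to an equivalent surface causes in the tangential
planes only a change from the basis $\mathbf{r}_u$, $\mathbf{r}_v$ to the basis
$(\mathbf{r}_1)_{u_1}$, $(\mathbf{r}_1)_{v_1}$.

According to formulas (8) the corresponding transition matrix
(more exactly, the matrix of inverse transition from the
basis $(\mathbf{r}_1)_{u_1}$, $(\mathbf{r}_1)_{v_1}$ to the basis $\mathbf{r}_u$, $\mathbf{r}_v$) has the form

$$\begin{pmatrix} \dfrac{\partial u_1}{\partial u} & \dfrac{\partial u_1}{\partial v} \\ \dfrac{\partial v_1}{\partial u} & \dfrac{\partial v_1}{\partial v} \end{pmatrix}, \tag{9}$$

i.e. is the Jacobian matrix of diffeomorphism (7).

In particular, it follows that the coordinates $du_1$, $dv_1$ of
tangent vectors in the basis $(\mathbf{r}_1)_{u_1}$, $(\mathbf{r}_1)_{v_1}$ are related to their
coordinates $du$, $dv$ in the basis $\mathbf{r}_u$, $\mathbf{r}_v$ by the formulas

$$\begin{aligned}
du_1 &= \frac{\partial u_1}{\partial u}\,du + \frac{\partial u_1}{\partial v}\,dv, \\
dv_1 &= \frac{\partial v_1}{\partial u}\,du + \frac{\partial v_1}{\partial v}\,dv,
\end{aligned} \tag{10}$$

coinciding with the formulas for the differentiation of functions
(7) known from analysis (this explains the choice of the symbols
$du$, $dv$ for the coordinates of tangent vectors).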

Now note that a tangential plane, being a plane in Euclidean
space, is itself a two-dimensional Euclidean space. It
has been customary since the time of Gauss to designate the
metric coefficients $g_{11}$, $g_{12}$, $g_{22}$ of the basis $\mathbf{r}_u$, $\mathbf{r}_v$ of the plane
by the symbols $E$, $F$, and $G$. Thus by definition

$$E = \mathbf{r}_u^2, \quad F = \mathbf{r}_u\mathbf{r}_v, \quad G = \mathbf{r}_v^2. \tag{11}$$

It should be stressed that formulas (11) define the coefficients
$E = E(u, v)$, $F = F(u, v)$, $G = G(u, v)$ as functions
of $u$ and $v$ (which is not surprising since under a change
of $u$, $v$ the tangential plane is changed and so is its basis
$\mathbf{r}_u$, $\mathbf{r}_v$).

Definition 2. The quadratic form

$$E\,du^2 + 2F\,du\,dv + G\,dv^2$$

of the coordinates $du$, $dv$ of the tangent vectors relative to the
basis $\mathbf{r}_u$, $\mathbf{r}_v$ is called the first quadratic form of surface (1)
and designated by the symbol $I$. The value of form $I$ on the
coordinates $du$, $dv$ of a tangent vector $d\mathbf{r}$ (designated
conventionally by the symbol $I(d\mathbf{r})$) is equal to the scalar
square of that vector:

$$d\mathbf{r}^2 = I(d\mathbf{r}) = E\,du^2 + 2F\,du\,dv + G\,dv^2. \tag{12}$$

This means that quadratic form $I$ is an expression in the
basis $\mathbf{r}_u$, $\mathbf{r}_v$ for the quadratic functional $d\mathbf{r} \mapsto d\mathbf{r}^2$.

Therefore the first quadratic form $I_1$ of the equivalent
surface (6) is an expression for the same functional $d\mathbf{r} \mapsto d\mathbf{r}^2$
but in a different basis $(\mathbf{r}_1)_{u_1}$, $(\mathbf{r}_1)_{v_1}$, and after replacement in
form $I_1$ of $du_1$, $dv_1$ with their expressions (10) form $I$ is
obtained.

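For the coefficients $E_1$, $F_1$, $G_1$ of form $I_1$ this implies that
they are related to the coefficients $E$, $F$, $G$ of form $I$ by
the formulas

$$\begin{aligned}
E(u, v) &= E_1(u_1, v_1)\left(\frac{\partial u_1}{\partial u}\right)^2 + 2F_1(u_1, v_1)\frac{\partial u_1}{\partial u}\frac{\partial v_1}{\partial u} + G_1(u_1, v_1)\left(\frac{\partial v_1}{\partial u}\right)^2, \\
F(u, v) &= E_1(u_1, v_1)\frac{\partial u_1}{\partial u}\frac{\partial u_1}{\partial v} + F_1(u_1, v_1)\left(\frac{\partial u_1}{\partial u}\frac{\partial v_1}{\partial v} + \frac{\partial u_1}{\partial v}\frac{\partial v_1}{\partial u}\right) + G_1(u_1, v_1)\frac{\partial v_1}{\partial u}\frac{\partial v_1}{\partial v}, \\
G(u, v) &= E_1(u_1, v_1)\left(\frac{\partial u_1}{\partial v}\right)^2 + 2F_1(u_1, v_1)\frac{\partial u_1}{\partial v}\frac{\partial v_1}{\partial v} + G_1(u_1, v_1)\left(\frac{\partial v_1}{\partial v}\right)^2.
\end{aligned} \tag{13}$$

Remark. Formulas (13) can be obtained by direct calculation,
substituting in formulas (11) for the coefficients $E$, $F$,
and $G$ expressions (8) for the vectors $\mathbf{r}_u$ and $\mathbf{r}_v$.

More loosely, under a change of coordinates on a surface
its first quadratic form is linearly transformed with Jacobian
matrix (9).

In other words, the first quadratic forms of equivalent surfaces
are equivalent (at every point). □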
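Formulas (13) lend themselves to a direct numerical check. The sketch below is illustrative only: the function name `transform_first_form` and its argument names are hypothetical, and the sample data (the plane with $E_1 = 1$, $F_1 = 0$, $G_1 = 1$ under the polar-type change $u_1 = v\cos u$, $v_1 = v\sin u$) is an assumed example, not the only possible one.

```python
import math

def transform_first_form(E1, F1, G1, du1_du, dv1_du, du1_dv, dv1_dv):
    """Apply formulas (13).

    E1, F1, G1 are the coefficients of the form I_1 at the image point;
    the remaining arguments are the partial derivatives of the
    coordinate change u1 = u1(u, v), v1 = v1(u, v).
    """
    E = E1 * du1_du ** 2 + 2.0 * F1 * du1_du * dv1_du + G1 * dv1_du ** 2
    F = (E1 * du1_du * du1_dv
         + F1 * (du1_du * dv1_dv + du1_dv * dv1_du)
         + G1 * dv1_du * dv1_dv)
    G = E1 * du1_dv ** 2 + 2.0 * F1 * du1_dv * dv1_dv + G1 * dv1_dv ** 2
    return E, F, G

# Assumed example: Cartesian plane (E1, F1, G1) = (1, 0, 1) with
# u1 = v cos u, v1 = v sin u, evaluated at the point (u, v) = (0.5, 2.0).
u, v = 0.5, 2.0
E, F, G = transform_first_form(
    1.0, 0.0, 1.0,
    du1_du=-v * math.sin(u), dv1_du=v * math.cos(u),
    du1_dv=math.cos(u), dv1_dv=math.sin(u),
)
```

For this choice the result should be close to $E = v^2$, $F = 0$, $G = 1$, i.e. the form $v^2\,du^2 + dv^2$ of a plane in polar coordinates.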

For the tangent vector (3) of an arbitrary curve (2) on
surface (1) it follows from formula (12) that for its length
$|\mathbf{r}'(t)|$ the following formula holds:

$$|\mathbf{r}'(t)| = \sqrt{\mathbf{r}'(t)^2} = \sqrt{E(t)\,u'(t)^2 + 2F(t)\,u'(t)\,v'(t) + G(t)\,v'(t)^2},$$

where

$$E(t) = E(u(t), v(t)), \quad F(t) = F(u(t), v(t)), \quad G(t) = G(u(t), v(t)).$$

But according to formula (12) of Lecture 23 the length $s$
of curve (2) between the points $t = a$ and $t = b$ is expressed
by the formula

$$s = \int_a^b |\mathbf{r}'(t)|\,dt.$$

Hence

$$s = \int_a^b \sqrt{E(t)\,u'(t)^2 + 2F(t)\,u'(t)\,v'(t) + G(t)\,v'(t)^2}\;dt, \tag{14}$$

which may be written in the following forms, conventional
but easier to remember:

$$s = \int_L \sqrt{E\,du^2 + 2F\,du\,dv + G\,dv^2}, \qquad s = \int_L \sqrt{I(d\mathbf{r})}.$$

The symbol $L$ designates curve (2) here.
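Formula (14) can be evaluated numerically once $E$, $F$, $G$ and the functions $u(t)$, $v(t)$ are known. The following is a minimal sketch, assuming a simple midpoint rule; the function name `curve_length` and the test curve (a parallel of the unit sphere, for which $E = \cos^2 v$, $F = 0$, $G = 1$) are illustrative assumptions.

```python
import math

def curve_length(E, F, G, u, v, du, dv, a, b, n=1000):
    """Approximate formula (14) by the midpoint rule with n subintervals.

    E, F, G are functions of (u, v); u, v, du, dv are functions of t
    giving the curve and its derivatives u'(t), v'(t).
    """
    h = (b - a) / n
    total = 0.0
    for i in range(n):
        t = a + (i + 0.5) * h
        ut, vt = u(t), v(t)
        q = (E(ut, vt) * du(t) ** 2
             + 2.0 * F(ut, vt) * du(t) * dv(t)
             + G(ut, vt) * dv(t) ** 2)
        total += math.sqrt(q) * h
    return total

# Assumed example: the parallel v = pi/3 on the unit sphere, 0 <= u <= 2*pi.
parallel = curve_length(
    E=lambda u, v: math.cos(v) ** 2,
    F=lambda u, v: 0.0,
    G=lambda u, v: 1.0,
    u=lambda t: t, v=lambda t: math.pi / 3,
    du=lambda t: 1.0, dv=lambda t: 0.0,
    a=0.0, b=2.0 * math.pi,
)
```

The parallel at latitude $\pi/3$ is a circle of radius $\cos(\pi/3) = 1/2$, so the computed length should be close to $\pi$.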

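The angle between two space curves $\mathbf{r} = \mathbf{r}(t)$ and
$\mathbf{r} = \mathbf{r}_1(t)$ intersecting for a given value $t$ of the parameter is
the angle $\varphi$ between their tangent vectors $\mathbf{r}' = \mathbf{r}'(t)$ and
$\mathbf{r}_1' = \mathbf{r}_1'(t)$. Hence

$$\cos\varphi = \frac{\mathbf{r}'\,\mathbf{r}_1'}{\sqrt{\mathbf{r}'^2}\,\sqrt{\mathbf{r}_1'^2}}.$$

If these curves lie on surface (1), i.e. if

$$\mathbf{r}(t) = \mathbf{r}(u(t), v(t)), \quad \mathbf{r}_1(t) = \mathbf{r}(u_1(t), v_1(t)),$$

then that formula for $\cos\varphi$ becomes

$$\cos\varphi = \frac{E\,u'u_1' + F\,(u'v_1' + v'u_1') + G\,v'v_1'}{\sqrt{E\,u'^2 + 2F\,u'v' + G\,v'^2}\;\sqrt{E\,u_1'^2 + 2F\,u_1'v_1' + G\,v_1'^2}}. \tag{15}$$

Setting

$$du = u'(t)\,dt, \quad dv = v'(t)\,dt, \quad \delta u = u_1'(t)\,dt, \quad \delta v = v_1'(t)\,dt,$$

we may write this formula in the following conventional
form:

$$\cos\varphi = \frac{E\,du\,\delta u + F\,(du\,\delta v + dv\,\delta u) + G\,dv\,\delta v}{\sqrt{E\,du^2 + 2F\,du\,dv + G\,dv^2}\;\sqrt{E\,\delta u^2 + 2F\,\delta u\,\delta v + G\,\delta v^2}},$$

or in short

$$\cos\varphi = \frac{d\mathbf{r}\,\delta\mathbf{r}}{|d\mathbf{r}|\,|\delta\mathbf{r}|}.$$

Sometimes this formula is written in the following form,
which is convenient to remember:

$$\cos\varphi = \frac{I(d, \delta)}{\sqrt{I(d)}\,\sqrt{I(\delta)}}.$$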











In particular, for the cosine of the angle between the coordinate
lines $u = \text{const}$ and $v = \text{const}$ we obtain the formula

$$\cos\varphi = \frac{F}{\sqrt{E}\,\sqrt{G}}.$$

Hence the coordinate lines $u = \text{const}$ and $v = \text{const}$ are orthogonal
if and only if $F = 0$. □
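The angle formula uses nothing but the coefficients $E$, $F$, $G$ and the coordinates of the two directions, which the following sketch makes explicit. The function name `cos_angle` and the numerical values $E = 2$, $F = 1$, $G = 1$ are hypothetical and chosen only so that the coordinate lines are not orthogonal.

```python
import math

def cos_angle(E, F, G, d, delta):
    """Cosine of the angle between tangent directions d = (du, dv) and
    delta = (delta_u, delta_v), computed from the first quadratic form."""
    du, dv = d
    su, sv = delta
    numerator = E * du * su + F * (du * sv + dv * su) + G * dv * sv
    length_d = math.sqrt(E * du * du + 2.0 * F * du * dv + G * dv * dv)
    length_delta = math.sqrt(E * su * su + 2.0 * F * su * sv + G * sv * sv)
    return numerator / (length_d * length_delta)

# Hypothetical coefficients at some point of a surface.
E, F, G = 2.0, 1.0, 1.0

# Directions of the two coordinate lines through the point.
c = cos_angle(E, F, G, (1.0, 0.0), (0.0, 1.0))
```

For coordinate directions the value reduces to $F/(\sqrt{E}\sqrt{G})$, here $1/\sqrt{2}$, so the coordinate lines would meet at an angle of $\pi/4$.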

Now let surface (6) be an arbitrary (regular) surface not
equivalent to surface (1) in general and let us be given some
mapping of the support of surface (1) into the support of
surface (6). In the case where surfaces (1) and (6) are injections
(which we know is always true locally, i.e. with $W$
and $W_1$ sufficiently small) the given mapping determines
some mapping $W \to W_1$ and conversely any mapping $W \to W_1$
determines some mapping of the support of surface (1)
into the support of surface (6). For this reason every mapping
$W \to W_1$ is called a mapping of surface (1) into
surface (6).

According to this definition any mapping of surface (1) 
into surface (6) is given by two functions 

$$u_1 = u_1(u, v), \quad v_1 = v_1(u, v) \tag{16}$$

defined for $(u, v) \in W$ and possessing the property that
$(u_1(u, v), v_1(u, v)) \in W_1$ for any point $(u, v) \in W$.

Mapping (16) is said to be a diffeomorphism of surface (1)
onto surface (6) if it is a diffeomorphism of the open set $W$
onto the open set $W_1$.

It should be stressed that non-identity functions (16)
may give an identity mapping of supports. It is clear that
this occurs if and only if

$$\mathbf{r}(u, v) = \mathbf{r}_1(u_1(u, v), v_1(u, v))$$

for any point $(u, v) \in W$, i.e. (see Definition 3 in Lecture 24)
if functions (16) give the equivalence of surfaces (1) and (6). □
On the other hand, whenever we are given some diffeomorphism
(16) we can go from surface (6) to an equivalent
surface

$$\mathbf{r} = \mathbf{r}_1(u_1(u, v), v_1(u, v)) \tag{17}$$

and then the same mapping of supports will be given by the
identity diffeomorphism $W \to W$. In more customary but
less precise terms this means that (with an appropriate
choice of coordinates on the surfaces) any diffeomorphism of
the surfaces is a mapping defined by equating the coordinates. □

Definition 3. Diffeomorphism (16) of surface (1) onto sur¬ 
face (6) is said to be an isometry if at any point (u, v) the 
first quadratic form of surface (17) coincides with that of 
surface (1), i.e. if for the coefficients of the first quadratic 
forms of surfaces (1) and (6) formulas (13) hold. 

Surfaces are said to be isometric if there exists at least one 
isometry of one surface onto the other. 

To clarify this definition consider on surface (1) an arbitrary
curve $L$. Let as above

$$u = u(t), \quad v = v(t), \quad a \le t \le b,$$

be the parametric equations of that curve (as a curve on
the surface). Every mapping (16) associates with the curve
$L$ a curve $L_1$ on surface (6) with parametric equations

$$u_1 = u_1(u(t), v(t)), \quad v_1 = v_1(u(t), v(t)), \quad a \le t \le b.$$

It is obvious that the support of the curve $L_1$ is the image
of the support of $L$ under the mapping of the supports of
surfaces determined by mapping (16). For this reason the
curve $L_1$ is called the image of the curve $L$ under mapping (16).

On the equivalent surface (17) the curve $L_1$ is given by the
same functions $u = u(t)$, $v = v(t)$ as the curve $L$ is on
surface (1). For the length $s_1$ of the curve $L_1$ therefore we
have the formula

$$s_1 = \int_a^b \sqrt{E^* u'^2 + 2F^* u'v' + G^* v'^2}\;dt,$$

where $E^*$, $F^*$, $G^*$ are the coefficients of the first quadratic
form of surface (17). When $E^* = E$, $F^* = F$, and $G^* = G$,
i.e. when diffeomorphism (16) is an isometry, this formula
coincides with formula (14) for the length $s$ of the curve $L$.
Therefore $s_1 = s$.

Conversely, suppose that for any curve $L$ on surface (1)
the length $s_1$ of its image on surface (6) (or, what is the
same, on the equivalent surface (17)) equals the length $s$
of the curve $L$. Then, in particular, this is true for the curve
$L$ given by the functions

$$u(t) = u_0 + at, \quad v(t) = v_0 + bt, \quad 0 \le t \le T,$$

where $(u_0, v_0)$ is an arbitrary point in $W$ and $a$ and $b$ are
arbitrary numbers (and $T$ is a number such that $(u(t),
v(t)) \in W$ for $0 \le t \le T$). But for these functions $u'(t) = a$,
$v'(t) = b$ and therefore the equation $s = s_1$ takes the
form

$$\int_0^T \sqrt{E a^2 + 2F ab + G b^2}\;dt = \int_0^T \sqrt{E^* a^2 + 2F^* ab + G^* b^2}\;dt,$$

from which, after differentiating with respect to $T$ and
substituting $T = 0$, it follows that

$$\sqrt{E(u_0, v_0)\,a^2 + 2F(u_0, v_0)\,ab + G(u_0, v_0)\,b^2} = \sqrt{E^*(u_0, v_0)\,a^2 + 2F^*(u_0, v_0)\,ab + G^*(u_0, v_0)\,b^2}.$$

Since the numbers $a$ and $b$ were chosen quite arbitrarily this is
possible if and only if

$$E(u_0, v_0) = E^*(u_0, v_0), \quad F(u_0, v_0) = F^*(u_0, v_0), \quad G(u_0, v_0) = G^*(u_0, v_0),$$

i.e. (since the point $(u_0, v_0)$ was an arbitrary point in $W$) if
$E = E^*$, $F = F^*$, and $G = G^*$ in $W$ and hence if
diffeomorphism (16) is an isometry.

This proves that a diffeomorphism of surfaces is an isometry
if and only if it preserves the lengths of curves, i.e. for any
curve $L$ on surface (1) its image $L_1$ on surface (6) has the same
length. □

On imagining a surface made of flexible but inextensible 
material and bending it arbitrarily we shall not change the 
lengths of curves on it and hence an isometric surface will 
result. On the basis of this intuitive idea the founders of the 
theory of surfaces called isometries bendings in the 19th 
century. This terminology has partly survived to this day, 
but now it is usual to understand bendings in a narrower 
sense, as isometries related to the identity transformation
by a continuous family of isometries. All mathematicians
have been certain for a long time that in the local
situation, i.e. in a sufficiently small neighbourhood of an 
arbitrary point, any isometry is a bending in that sense. 
Comparatively recently, however, Professor N. V. Yefimov, 
of Moscow University, has shown this to be false by con¬ 
structing an appropriate counterexample. 

Preserving lengths under isometries is a consequence of 
the fact that in formula (14) for the length of a curve only 
the coefficients of the first quadratic form I appear (besides 
the functions defining the curve). But formula (15) for the 
angle between curves also possesses this property. Therefore 
angles are also preserved under isometries. 

It is convenient to give the name of the intrinsic geometry 
of a surface to the collection of all concepts and statements 
remaining unchanged under isometries. Thus the concepts of 
length and angle belong to intrinsic geometry. 

It is clear that intrinsic geometry comprises every notion 
that, like lengths and angles, may be defined using the first 
quadratic form alone. 

By definition, two surfaces have the same intrinsic geometry
(are isometric) when their first quadratic forms can be made
identical by changing coordinates. This test is of course quite
ineffective. Therefore our immediate aim is to make it more 
effective. We shall deal with this in our next lecture, and 
now we shall consider a number of examples illustrating 
calculation of the first quadratic form of surfaces. 

Example 1. A plane $Oxy$ has in the coordinates $u = x$ and
$v = y$ the parametric equation $\mathbf{r} = u\mathbf{i} + v\mathbf{j}$. Therefore
$\mathbf{r}_u = \mathbf{i}$, $\mathbf{r}_v = \mathbf{j}$ and hence $E = 1$, $F = 0$, $G = 1$, i.e. for the
plane

$$I = du^2 + dv^2. \tag{18}$$

(A result easy to foretell without any calculations.)

Example 2. For the circular cylinder

$$\mathbf{r} = R\cos u\,\mathbf{i} + R\sin u\,\mathbf{j} + v\,\mathbf{k}$$

we have $\mathbf{r}_u = -R\sin u\,\mathbf{i} + R\cos u\,\mathbf{j}$ and $\mathbf{r}_v = \mathbf{k}$. Therefore

$$E = \mathbf{r}_u^2 = R^2, \quad F = \mathbf{r}_u\mathbf{r}_v = 0, \quad G = \mathbf{r}_v^2 = 1,$$

i.e. for the cylinder

$$I = R^2\,du^2 + dv^2.$$

By introducing a new coordinate $u_1 = Ru$ (and again denoting
$u_1$ by $u$) we transform this form to the form (18). Hence
a cylinder is isometric with a plane.

Intuitively this fact is obvious: to bend a cylinder into a
plane it is sufficient to cut it along a generator.

Example 3. For the surface of revolution

$$\mathbf{r} = x(v)\cos u\,\mathbf{i} + x(v)\sin u\,\mathbf{j} + z(v)\,\mathbf{k}$$

we have

$$\mathbf{r}_u = -x(v)\sin u\,\mathbf{i} + x(v)\cos u\,\mathbf{j},$$

$$\mathbf{r}_v = x'(v)\cos u\,\mathbf{i} + x'(v)\sin u\,\mathbf{j} + z'(v)\,\mathbf{k}.$$

Hence

$$\begin{aligned}
E &= x(v)^2\sin^2 u + x(v)^2\cos^2 u = x(v)^2, \\
F &= -x(v)\sin u \cdot x'(v)\cos u + x(v)\cos u \cdot x'(v)\sin u = 0, \\
G &= x'(v)^2\cos^2 u + x'(v)^2\sin^2 u + z'(v)^2 = x'(v)^2 + z'(v)^2,
\end{aligned}$$

so that for the surface of revolution

$$I = x(v)^2\,du^2 + (x'(v)^2 + z'(v)^2)\,dv^2.$$

It is intuitively obvious that the meridians and parallels
of any surface of revolution are orthogonal. The equation
$F = 0$ could therefore be foreseen without any calculations
as well.

In the case where the profile $x = x(v)$, $z = z(v)$ of a surface
of revolution is referred to the natural parameter $v = s$
(and therefore $x'(v)^2 + z'(v)^2 = 1$) form $I$ takes an especially
simple form:

$$I = x(v)^2\,du^2 + dv^2.$$

In particular we see that the first quadratic form of a
sphere (of radius 1) is of the form

$$I = \cos^2 v\,du^2 + dv^2. \tag{19}$$
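The coefficients found in Examples 1 to 3 can be cross-checked by approximating $\mathbf{r}_u$ and $\mathbf{r}_v$ with central differences and forming the scalar products in (11). This is only a sketch: the helper `first_form`, the step size and the sample point are assumptions made for illustration.

```python
import math

def first_form(r, u, v, h=1e-6):
    """Approximate E, F, G of formulas (11) at (u, v) for a surface
    r(u, v) -> (x, y, z), using central differences for r_u and r_v."""
    r_u = [(p - m) / (2.0 * h) for p, m in zip(r(u + h, v), r(u - h, v))]
    r_v = [(p - m) / (2.0 * h) for p, m in zip(r(u, v + h), r(u, v - h))]
    dot = lambda a, b: sum(x * y for x, y in zip(a, b))
    return dot(r_u, r_u), dot(r_u, r_v), dot(r_v, r_v)

def sphere(u, v):
    """Unit sphere in geographical coordinates (longitude u, latitude v)."""
    return (math.cos(v) * math.cos(u),
            math.cos(v) * math.sin(u),
            math.sin(v))

# Sample point (an arbitrary choice away from the poles).
u0, v0 = 0.4, 0.9
E, F, G = first_form(sphere, u0, v0)
```

Up to discretization error the result should agree with form (19), i.e. $E = \cos^2 v$, $F = 0$, $G = 1$.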




Cartographic experience shows that no portion of a sphere 
however small can be bent into a plane. This means that no 
transformation of coordinates can convert form (19) into 
form (18). But how is this to be proved? The answer will be 
given in our last lecture. 

Example 4. The deflection line of a heavy homogeneous
thread is called a catenary (curve) and a surface of revolution
whose profile is a catenary is called a catenoid.

In mechanics (statics) it is shown that a catenary is the
graph of a hyperbolic cosine. Thus for a catenoid
$x(v) = \operatorname{ch} v$, $z(v) = v$ and hence

$$x(v)^2 = \operatorname{ch}^2 v \quad \text{and} \quad x'(v)^2 + z'(v)^2 = \operatorname{sh}^2 v + 1 = \operatorname{ch}^2 v.$$

Thus for the catenoid

$$I = \operatorname{ch}^2 v\,(du^2 + dv^2). \tag{20}$$

Example 5. Let a straight line perpendicular to the axis $Oz$
rotate uniformly about it while remaining perpendicular to it
and simultaneously ascending in helical motion (to a height
proportional to the angle of rotation). The ruled surface
swept out by that straight line is called a helicoid. It has
the form of a helical ramp for cars to drive up.

















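If $v$ is the parameter on the straight line and $u$ is the angle
of rotation, then the helicoid will have the equation

$$\mathbf{r} = v\cos u\,\mathbf{i} + v\sin u\,\mathbf{j} + u\,\mathbf{k}.$$

Therefore

$$\mathbf{r}_u = -v\sin u\,\mathbf{i} + v\cos u\,\mathbf{j} + \mathbf{k}, \quad \mathbf{r}_v = \cos u\,\mathbf{i} + \sin u\,\mathbf{j},$$

and hence

$$E = 1 + v^2, \quad F = 0, \quad G = 1.$$

Thus for a helicoid

$$I = (1 + v^2)\,du^2 + dv^2.$$

Let us transform this form by introducing new coordinates
$u_1$, $v_1$ related to the coordinates $u$, $v$ by the formulas

$$u = u_1, \quad v = \operatorname{sh} v_1.$$

Then

$$1 + v^2 = 1 + \operatorname{sh}^2 v_1 = \operatorname{ch}^2 v_1, \quad du = du_1, \quad dv = \operatorname{ch} v_1\,dv_1,$$

and therefore (we drop the indices in the new coordinates)

$$I = \operatorname{ch}^2 v\,(du^2 + dv^2),$$

which coincides with form (20).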

This proves that the catenoid and the helicoid are isometric 
(only locally of course), there existing an isometry trans¬ 
forming meridians of the catenoid into rectilinear generators 
of the helicoid. 

An astonishing result! 
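The coincidence of the two forms can also be observed numerically. The sketch below is illustrative and rests on assumptions: the helper `first_form` approximates formulas (11) by central differences, the function names are hypothetical, and the sample point is arbitrary. It compares the first quadratic form of the catenoid with that of the helicoid after the substitution $v = \operatorname{sh} v_1$.

```python
import math

def first_form(r, u, v, h=1e-6):
    """Approximate E, F, G at (u, v) for a surface r(u, v) -> (x, y, z)."""
    r_u = [(p - m) / (2.0 * h) for p, m in zip(r(u + h, v), r(u - h, v))]
    r_v = [(p - m) / (2.0 * h) for p, m in zip(r(u, v + h), r(u, v - h))]
    dot = lambda a, b: sum(x * y for x, y in zip(a, b))
    return dot(r_u, r_u), dot(r_u, r_v), dot(r_v, r_v)

def catenoid(u, v):
    """Surface of revolution with profile x = ch v, z = v."""
    return (math.cosh(v) * math.cos(u), math.cosh(v) * math.sin(u), v)

def helicoid(u, v):
    """Helicoid r = v cos u i + v sin u j + u k."""
    return (v * math.cos(u), v * math.sin(u), u)

def helicoid_reparametrized(u1, v1):
    """Helicoid in the new coordinates u = u1, v = sh v1."""
    return helicoid(u1, math.sinh(v1))

point = (0.3, 0.8)  # arbitrary sample point
form_catenoid = first_form(catenoid, *point)
form_helicoid = first_form(helicoid_reparametrized, *point)
```

Both triples should be close to $(\operatorname{ch}^2 v,\ 0,\ \operatorname{ch}^2 v)$ at the chosen point, in agreement with form (20).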

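Example 6. For an arbitrary ruled surface

$$\mathbf{r} = \mathbf{p}(u) + v\,\mathbf{a}(u), \tag{21}$$

where (see the preceding lecture) $\mathbf{p} = \mathbf{p}(u)$ is a regular curve
referred to the natural parameter and $\mathbf{a}(u)$ is a vector function
such that $|\mathbf{a}(u)| = 1$ for all $u$, denoting differentiation
with respect to $u$ with a dot, we have

$$\mathbf{r}_u = \dot{\mathbf{p}} + v\,\dot{\mathbf{a}}, \quad \mathbf{r}_v = \mathbf{a}.$$

Since $\dot{\mathbf{p}}^2 = 1$, $\mathbf{a}\dot{\mathbf{a}} = 0$ and $\mathbf{a}^2 = 1$, we have

$$E = 1 + 2v\,\dot{\mathbf{p}}\dot{\mathbf{a}} + v^2\dot{\mathbf{a}}^2, \quad F = \dot{\mathbf{p}}\mathbf{a}, \quad G = 1.$$

If in particular $\mathbf{a} = \dot{\mathbf{p}}$ (a surface of tangents), then
$\dot{\mathbf{p}}\mathbf{a} = \mathbf{a}^2 = 1$ (i.e. $F = 1$), $\dot{\mathbf{p}}\dot{\mathbf{a}} = 0$ and $\dot{\mathbf{a}}^2 = k^2$, where $k$ is
the curvature of the curve $\mathbf{p} = \mathbf{p}(u)$ (i.e. $E = 1 + k^2v^2$).
Thus for a surface of tangents

$$I = (1 + k^2v^2)\,du^2 + 2\,du\,dv + dv^2. \tag{22}$$

But if $\mathbf{a}(u)$ is the binormal vector of the curve $\mathbf{p} = \mathbf{p}(u)$,
then $\dot{\mathbf{p}}\mathbf{a} = 0$, $\dot{\mathbf{p}}\dot{\mathbf{a}} = 0$ and $\dot{\mathbf{a}}^2 = \varkappa^2$, where $\varkappa$ is the torsion of
the curve $\mathbf{p} = \mathbf{p}(u)$. Hence for a surface of binormals

$$I = (1 + \varkappa^2v^2)\,du^2 + dv^2.$$

We thus see that the first quadratic form of a surface
of tangents depends only on the curvature of the given curve
and that the first quadratic form of a surface of binormals
depends only on the torsion of the curve.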

For surfaces of tangents this implies that every surface of
tangents is isometric with a plane (locally). Indeed, consider
a plane curve with the same curvature $k = k(u)$ (such a
curve exists by virtue of Theorem 1 of Lecture 24). The first
quadratic form of the surface of tangents of that curve is the
same form (22). But, on the other hand, it is clear that a
surface of tangents of a plane curve is (locally) a plane.
There exists therefore a change of coordinates transforming
the first quadratic form $dx^2 + dy^2$ of the plane into form
(22). (This change of coordinates has the form

$$x = x(u) + x'(u)\,v, \quad y = y(u) + y'(u)\,v,$$

where $x(u)$ and $y(u)$ are functions such that $x'(u)^2 + y'(u)^2 = 1$
and $x''(u)^2 + y''(u)^2 = k(u)^2$.) □

This isometry can be carried out by continuous bending,
gradually deforming the curve $\mathbf{p} = \mathbf{p}(u)$ into a plane curve.

For this reason surfaces of tangents are called developable
surfaces (or developables) (development into a plane is
meant).



If $\mathbf{a}(u) = \mathbf{p}(u)$, surface (21) is a cone with vertex at the
origin (and the curve $\mathbf{p} = \mathbf{p}(u)$ is the intersection of the
cone with the unit sphere $|\mathbf{p}| = 1$). In this case we have

$$\dot{\mathbf{p}}\dot{\mathbf{a}} = \dot{\mathbf{p}}^2 = 1, \quad \dot{\mathbf{a}}^2 = 1, \quad \dot{\mathbf{p}}\mathbf{a} = 0,$$

so that form $I$ becomes

$$I = (1 + v)^2\,du^2 + dv^2.$$

Here the change of coordinates $(u, v) \mapsto (u, 1 + v)$ suggests
itself, converting the last form into a slightly simpler
form

$$I = v^2\,du^2 + dv^2. \tag{23}$$

Now let us introduce new coordinates

$$x = v\cos u, \quad y = v\sin u.$$

Then

$$dx = -v\sin u\,du + \cos u\,dv, \quad dy = v\cos u\,du + \sin u\,dv,$$

and hence

$$dx^2 + dy^2 = v^2\,du^2 + dv^2.$$

This proves that any cone is isometric with a plane. For
this reason cones are also reckoned among developables.

Note that form (23) is nothing but the first quadratic form
of a plane referred to polar coordinates $r = v$ and $\varphi = u$.

Finally, if the vector $\mathbf{a}(u)$ is constant (and therefore
$\dot{\mathbf{a}} = 0$), surface (21) is a cylinder. We may consider without
loss of generality that the directrix $\mathbf{p} = \mathbf{p}(u)$ of the cylinder
is a plane curve whose plane is orthogonal to the vector
$\mathbf{a}$ (and hence $\dot{\mathbf{p}}\mathbf{a} = 0$ and $\dot{\mathbf{p}}\dot{\mathbf{a}} = 0$). Therefore, as with the
circular cylinder (Example 2),

$$I = du^2 + dv^2.$$

For this reason all cylinders are also reckoned among deve¬ 
lopables. 

In the next lecture we shall show that among ruled sur¬ 
faces only developables (i.e. cylinders, cones, and surfaces of 
tangents) are isometric with a plane. Moreover, it turns out 
that developables exhaust all the surfaces isometric with a 
plane. We shall leave this fact without proof. 




Lecture 26 


The tangential plane and the normal vector • The curvature of
a normal section • The second quadratic form of a surface • The
indicatrix of Dupin • Principal curvatures • The second quadratic
form of a graph • Ruled surfaces of zero curvature • Surfaces
of revolution


We proceed to consider an arbitrary regular surface

$$\mathbf{r} = \mathbf{r}(u, v), \quad (u, v) \in W, \tag{1}$$

in a three-dimensional Euclidean space $\mathcal{E}$.
Recall (see the preceding lecture) that the tangential plane
at a point $(u, v)$ of surface (1) is the plane in space passing
through the point with radius vector $\mathbf{r}(u, v)$ and having the
direction bivector $\mathbf{r}_u \wedge \mathbf{r}_v$. If the space $\mathcal{E}$ is oriented, then
for any point $(u, v)$ of the surface a unit vector $\mathbf{n} = \mathbf{n}(u, v)$
is defined perpendicular to the tangential plane and
constituting, together with the vectors $\mathbf{r}_u$ and $\mathbf{r}_v$, a positively
oriented basis

$$\mathbf{r}_u, \quad \mathbf{r}_v, \quad \mathbf{n} \tag{2}$$

of the space $\mathcal{E}$ (more precisely, of its associated vector
space $\mathcal{V}$). That vector is called the normal vector to surface (1) at
the point $(u, v)$. Basis (2) is called the moving basis of the
surface at the point $(u, v)$.

It should be stressed that the moving basis is not orthonormal
in general.

The vector $\mathbf{n}$ is of course collinear with the vector $\mathbf{r}_u \times \mathbf{r}_v$.
Hence

$$\mathbf{n} = \frac{\mathbf{r}_u \times \mathbf{r}_v}{|\mathbf{r}_u \times \mathbf{r}_v|}.$$






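Lemma 1. For any two vectors $\mathbf{a}$, $\mathbf{b}$ of a three-dimensional
oriented Euclidean vector space $\mathcal{V}$ we have

$$|\mathbf{a} \times \mathbf{b}|^2 = \begin{vmatrix} \mathbf{a}^2 & \mathbf{a}\mathbf{b} \\ \mathbf{a}\mathbf{b} & \mathbf{b}^2 \end{vmatrix}.$$

Proof. Let $\mathbf{i}$, $\mathbf{j}$, $\mathbf{k}$ be a positively oriented orthonormal basis
of the vector space $\mathcal{V}$ such that

$$\mathbf{a} = a\mathbf{i}, \quad \mathbf{b} = b'\mathbf{i} + b\mathbf{j}.$$

Then $\mathbf{a} \times \mathbf{b} = ab\,\mathbf{k}$ and

$$\mathbf{a}^2 = a^2, \quad \mathbf{a}\mathbf{b} = ab', \quad \mathbf{b}^2 = b'^2 + b^2.$$

Therefore $|\mathbf{a} \times \mathbf{b}|^2 = a^2b^2$ and

$$\begin{vmatrix} \mathbf{a}^2 & \mathbf{a}\mathbf{b} \\ \mathbf{a}\mathbf{b} & \mathbf{b}^2 \end{vmatrix} = \begin{vmatrix} a^2 & ab' \\ ab' & b'^2 + b^2 \end{vmatrix} = a^2(b'^2 + b^2) - a^2b'^2 = a^2b^2. \;\square$$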


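Remark. In any Euclidean space a theory of volumes can
be developed quite similar to the elementary theory of areas
and volumes in three-dimensional space. Then Lemma 1
will turn out to be a special case of the general proposition
stating that for any vectors $\mathbf{a}_1, \ldots, \mathbf{a}_m$ of an arbitrary
Euclidean vector space $\mathcal{V}$ the square of the $m$-dimensional volume
of the parallelepiped constructed on those vectors is equal to
the determinant

$$\begin{vmatrix} \mathbf{a}_1^2 & \mathbf{a}_1\mathbf{a}_2 & \cdots & \mathbf{a}_1\mathbf{a}_m \\ \mathbf{a}_2\mathbf{a}_1 & \mathbf{a}_2^2 & \cdots & \mathbf{a}_2\mathbf{a}_m \\ \vdots & \vdots & \ddots & \vdots \\ \mathbf{a}_m\mathbf{a}_1 & \mathbf{a}_m\mathbf{a}_2 & \cdots & \mathbf{a}_m^2 \end{vmatrix}.$$

This is called the Gramian of the vectors $\mathbf{a}_1, \ldots, \mathbf{a}_m$. It is zero
if and only if these vectors are linearly dependent. If
$m = \dim \mathcal{V}$ and the vectors $\mathbf{a}_1, \ldots, \mathbf{a}_m$ are linearly independent
(constitute a basis), the elements of the Gramian are
nothing but the metric coefficients of that basis.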
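Lemma 1 is easy to test on concrete vectors. The sketch below uses two arbitrarily chosen integer vectors (an assumption made purely for illustration) and hypothetical helper names `dot` and `cross`; it compares $|\mathbf{a} \times \mathbf{b}|^2$ with the $2 \times 2$ Gramian.

```python
def dot(a, b):
    """Scalar product of two 3-vectors."""
    return sum(x * y for x, y in zip(a, b))

def cross(a, b):
    """Vector product of two 3-vectors."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

# Arbitrary sample vectors with integer coordinates.
a = (1, 2, 3)
b = (-2, 0, 1)

# Left-hand side of Lemma 1: squared length of the vector product.
c = cross(a, b)
lhs = dot(c, c)

# Right-hand side: the Gramian | a^2  ab ; ab  b^2 |.
gramian = dot(a, a) * dot(b, b) - dot(a, b) ** 2
```

With integer coordinates both sides are computed exactly, so they can be compared for equality.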

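On applying Lemma 1 to the vectors $\mathbf{r}_u$ and $\mathbf{r}_v$, we at once
see that

$$|\mathbf{r}_u \times \mathbf{r}_v|^2 = \begin{vmatrix} E & F \\ F & G \end{vmatrix} = EG - F^2,$$

and hence that

$$\mathbf{n} = \frac{\mathbf{r}_u \times \mathbf{r}_v}{\sqrt{EG - F^2}}.$$

It is by this formula that the vector $\mathbf{n}$ is usually computed.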


Let $\mathbf{t}_0$ be an arbitrary unit vector which is a tangent
vector of surface (1) at a point $(u_0, v_0)$. Consider the
plane passing through the point with radius vector
$\mathbf{r}(u_0, v_0)$ and having the direction bivector $\mathbf{t}_0 \wedge \mathbf{n}_0$, where
$\mathbf{n}_0 = \mathbf{n}(u_0, v_0)$. It is intuitively obvious that the
plane intersects the surface in some curve having at
the point $(u_0, v_0)$ the tangent vector $\mathbf{t}_0$ (and hence regular).
This curve is called the normal section of surface (1) determined
by the tangent vector $\mathbf{t}_0$.


Let rectangular coordinates $x$, $y$, $z$ be chosen in the space $\mathcal{E}$
so that surface (1) in the neighbourhood of the point $(u_0, v_0)$
is the graph of a smooth function $z = z(x, y)$, with $\mathbf{n}_0$ being
the coordinate unit vector $\mathbf{k}$. Then, if $\mathbf{t}_0 = a\mathbf{i} + b\mathbf{j}$, the
normal section determined by the vector $\mathbf{t}_0$ will obviously
have (as a curve on the surface) the equations

$$u = u_0 + at, \quad v = v_0 + bt$$

(in space this curve would have the equations $x = u_0 + at$,
$y = v_0 + bt$, $z = z(u_0 + at, v_0 + bt)$).

This not only provides a method of writing the equations
of a normal section, but also allows its formal definition
(not based on intuition) as a curve on the surface with
equations $u = u_0 + at$, $v = v_0 + bt$ (provided of course
surface (1) is represented as the graph of the smooth function
$z = z(x, y)$). It is certainly required here to verify the
correctness of this definition, which is in principle not hard to
do. We shall not deal with this, however, since the notion of
normal section will play in our discussion only an auxiliary
and mainly heuristic role.

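Let u = u(s), v = v(s) be the equations (on the surface)
of the normal section of surface (1) at a point $(u_0, v_0)$,
determined by the tangent vector $\mathbf{t}_0$. Suppose that s is the
natural parameter of the space curve $\mathbf{r} = \mathbf{r}(s)$, where
$\mathbf{r}(s) = \mathbf{r}(u(s), v(s))$, with $u(0) = u_0$, $v(0) = v_0$. Then for the
tangent vector $\mathbf{t} = \mathbf{t}(s)$ of the normal section we have

$$\mathbf{t} = \dot{\mathbf{r}} = \mathbf{r}_u \dot{u} + \mathbf{r}_v \dot{v},$$

with $\mathbf{t}(0) = \mathbf{t}_0$. Hence

$$\dot{\mathbf{t}} = \dot{\mathbf{r}}_u \dot{u} + \mathbf{r}_u \ddot{u} + \dot{\mathbf{r}}_v \dot{v} + \mathbf{r}_v \ddot{v}
= (\mathbf{r}_{uu}\dot{u} + \mathbf{r}_{uv}\dot{v})\dot{u} + (\mathbf{r}_{vu}\dot{u} + \mathbf{r}_{vv}\dot{v})\dot{v} + \mathbf{r}_u \ddot{u} + \mathbf{r}_v \ddot{v}
= \mathbf{r}_{uu}\dot{u}^2 + 2\mathbf{r}_{uv}\dot{u}\dot{v} + \mathbf{r}_{vv}\dot{v}^2 + \mathbf{r}_u \ddot{u} + \mathbf{r}_v \ddot{v}.$$

Putting here s = 0 and multiplying by $\mathbf{n}_0$ we get

$$\dot{\mathbf{t}}(0)\,\mathbf{n}_0 = ((\mathbf{r}_{uu})_0 \mathbf{n}_0)\,\dot{u}(0)^2 + 2((\mathbf{r}_{uv})_0 \mathbf{n}_0)\,\dot{u}(0)\dot{v}(0) + ((\mathbf{r}_{vv})_0 \mathbf{n}_0)\,\dot{v}(0)^2, \tag{3}$$

for $(\mathbf{r}_u)_0 \mathbf{n}_0 = 0$ and $(\mathbf{r}_v)_0 \mathbf{n}_0 = 0$.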






Now note that by definition a normal section is a plane
curve. In the plane of the curve the vectors $\mathbf{t}_0$, $\mathbf{n}_0$ determine
some orientation and with respect to that orientation the
normal section will have at each of its points relative
curvature $k_{\mathrm{rel}}$ (see Lecture 22). At the point s = 0 the curvature
is obviously equal to the scalar product $\dot{\mathbf{t}}(0)\,\mathbf{n}_0$ we have
just computed and is therefore expressed by formula (3).

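To simplify the formulas we shall now drop the index zero
everywhere, i.e. denote the vector $\mathbf{t}_0$ by t, the point $(u_0, v_0)$
by (u, v) and so on. The relative curvature (at the point
s = 0) of the normal section at a point (u, v), determined by
the vector t, will be denoted by k(t). Besides, we set

$$L = \mathbf{r}_{uu}\mathbf{n} = -\mathbf{r}_u\mathbf{n}_u, \quad
M = \mathbf{r}_{uv}\mathbf{n} = -\mathbf{r}_u\mathbf{n}_v = -\mathbf{r}_v\mathbf{n}_u, \quad
N = \mathbf{r}_{vv}\mathbf{n} = -\mathbf{r}_v\mathbf{n}_v \tag{4}$$

(since $\mathbf{r}_u\mathbf{n} = 0$, we have $\mathbf{r}_{uu}\mathbf{n} + \mathbf{r}_u\mathbf{n}_u = 0$ and
$\mathbf{r}_{uv}\mathbf{n} + \mathbf{r}_u\mathbf{n}_v = 0$, and since $\mathbf{r}_v\mathbf{n} = 0$, we have
$\mathbf{r}_{vu}\mathbf{n} + \mathbf{r}_v\mathbf{n}_u = 0$ and $\mathbf{r}_{vv}\mathbf{n} + \mathbf{r}_v\mathbf{n}_v = 0$). In this
notation formula (3) takes the form

$$k(\mathbf{t}) = L\dot{u}^2 + 2M\dot{u}\dot{v} + N\dot{v}^2, \tag{5}$$

where $\dot{u}$ and $\dot{v}$ are the coordinates of the vector t in the basis
$\mathbf{r}_u$, $\mathbf{r}_v$:

$$\mathbf{t} = \mathbf{r}_u\dot{u} + \mathbf{r}_v\dot{v}.$$

Formula (5) may now be taken as a formal definition of a
function $\mathbf{t} \mapsto k(\mathbf{t})$, and all said above regarded as merely an
informal motivation of the definition.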

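It is convenient to extend the function $\mathbf{t} \mapsto k(\mathbf{t})$ constructed
now to include all possible nonzero tangent vectors
$d\mathbf{r} = \mathbf{r}_u\,du + \mathbf{r}_v\,dv$, assuming by definition that

$$k(d\mathbf{r}) = k\!\left(\frac{d\mathbf{r}}{ds}\right)$$

(recall that $ds = |d\mathbf{r}|$; see above). Since the coordinates of
the unit vector $\frac{d\mathbf{r}}{ds}$ are the numbers $\frac{du}{ds}$ and $\frac{dv}{ds}$, we have by
formula (5)

$$k(d\mathbf{r}) = L\left(\frac{du}{ds}\right)^2 + 2M\,\frac{du}{ds}\frac{dv}{ds} + N\left(\frac{dv}{ds}\right)^2
= \frac{L\,du^2 + 2M\,du\,dv + N\,dv^2}{ds^2}.$$

Since

$$ds^2 = E\,du^2 + 2F\,du\,dv + G\,dv^2,$$

it follows that

$$k(d\mathbf{r}) = \frac{L\,du^2 + 2M\,du\,dv + N\,dv^2}{E\,du^2 + 2F\,du\,dv + G\,dv^2}. \tag{6}$$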

Definition 1. The quadratic form

$$L\,du^2 + 2M\,du\,dv + N\,dv^2$$

is called the second quadratic form of surface (1). It is
designated by the symbol II.

On introducing the vector

$$d\mathbf{n} = \mathbf{n}_u\,du + \mathbf{n}_v\,dv, \tag{7}$$

we can identify form II (by virtue of (4)) with the scalar
product $-d\mathbf{r}\,d\mathbf{n}$.

Formula (6) can now be written in the following form
convenient to remember:

$$k = \frac{II}{I}$$

or, using vector (7), in the form

$$k = -\frac{d\mathbf{r}\,d\mathbf{n}}{d\mathbf{r}^2}.$$

In the literature symbols D, D', D'' are also used to
designate the coefficients L, M, N of form II.
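
The ratio k = II/I is easy to evaluate once the six coefficients are known. The following is a minimal sketch; the function name `normal_curvature` and the choice of a circular cylinder with outward normal as the test surface are assumptions made for illustration. Note that the sign of the result depends on the choice of the normal n.

```python
def normal_curvature(first, second, du, dv):
    """k = II / I for the tangent direction with coordinates (du, dv)."""
    E, F, G = first
    L, M, N = second
    I = E * du * du + 2 * F * du * dv + G * dv * dv
    II = L * du * du + 2 * M * du * dv + N * dv * dv
    if I == 0:
        raise ValueError("the direction (du, dv) must be nonzero")
    return II / I

# Circular cylinder r = (R cos u, R sin u, v) with n = (cos u, sin u, 0):
# E = R^2, F = 0, G = 1 and L = -R, M = 0, N = 0.
R = 2.0
first, second = (R * R, 0.0, 1.0), (-R, 0.0, 0.0)

k_circle = normal_curvature(first, second, 1.0, 0.0)   # along the circle
k_ruling = normal_curvature(first, second, 0.0, 1.0)   # along the generator
k_scaled = normal_curvature(first, second, 5.0, 0.0)   # same direction, longer vector
print(k_circle, k_ruling, k_scaled)
```

Because numerator and denominator are both quadratic in (du, dv), rescaling the tangent vector does not change the value, which is why the extension from unit vectors to arbitrary nonzero tangent vectors is consistent.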

To visualize the function $\mathbf{t} \mapsto k(\mathbf{t})$ the French mathematician
Dupin suggested that on the tangent plane the curve
(now called the indicatrix of Dupin) should be considered
that results if for any unit vector t a segment of length
$|k(\mathbf{t})|^{-1/2}$ is marked off from the point of tangency (taken
as the origin O on the tangent plane) in the direction of that
vector. Denote by x and y the coordinates (in the coordinate
system $O\mathbf{r}_u\mathbf{r}_v$) of the terminal point of the segment; then its
length is expressible (in clear notation) by the formula

$$\sqrt{I(x, y)}.$$

Since the curvature k(t) can be expressed by formula (6),
which in the present notation has the form

$$k(\mathbf{t}) = \frac{II(x, y)}{I(x, y)},$$

we obtain for the indicatrix of Dupin the equation

$$\sqrt{I(x, y)} = \left|\frac{II(x, y)}{I(x, y)}\right|^{-1/2},$$

i.e. the equation

$$|II(x, y)| = 1.$$

This proves that the indicatrix of Dupin is a curve with
equation

$$|Lx^2 + 2Mxy + Ny^2| = 1.$$


When $LN - M^2 > 0$ the curve (more precisely, the set of
its real points which is our only concern) is an ellipse with
equation

$$Lx^2 + 2Mxy + Ny^2 = \varepsilon, \tag{8}$$

where $\varepsilon = +1$ if L > 0 and $\varepsilon = -1$ if L < 0. Accordingly
a point of surface (1) at which $LN - M^2 > 0$ is called
elliptical.

At an elliptical point all curvatures k(t) have the same
sign (coinciding with that of L). Among them, there is one
maximum $k_1$ and one minimum $k_2$ (unless they all coincide,
i.e. unless the indicatrix of Dupin is a circle) corresponding
to the directions of the minor and major axes of ellipse (8).

When $LN - M^2 < 0$ the indicatrix of Dupin consists of
two hyperbolas

$$Lx^2 + 2Mxy + Ny^2 = \pm 1 \tag{9}$$

with common asymptotes and therefore a point of surface (1)
at which $LN - M^2 < 0$ is called hyperbolic. In the direction
of the real axis of one of the hyperbolas (9) the curvature
k(t) attains its maximum value $k_1 > 0$. As the vector t is
rotated the curvature first decreases to zero, when the vector
t assumes asymptotic direction, and then, continuing to
decrease, attains its minimum value $k_2 < 0$, when the
direction of the vector t coincides with that of the real axis
of the other hyperbola (i.e. with the direction of the
imaginary axis of the first hyperbola).

[Figure: The indicatrix of Dupin at an elliptical point, at a hyperbolic point, and at a parabolic point]

When $LN - M^2 = 0$ a point of surface (1) is called
parabolic. At such a point the indicatrix of Dupin has the
equation

$$\left(\sqrt{|L|}\,x \pm \sqrt{|N|}\,y\right)^2 = 1 \tag{10}$$

and therefore is a pair of parallel lines (provided $L \neq 0$
or $N \neq 0$). In the direction of these lines the curvature
k(t) is equal to zero, in the perpendicular direction it
reaches its maximum (in magnitude) maintaining throughout
the same sign. But if L = 0, N = 0 (and therefore M = 0),
the curvature k(t) is identically as a function of t equal to
zero (and the indicatrix of Dupin is not defined).
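
The three cases can be summarized by a short classification routine. This is only a sketch: the function name `classify_point` and the tolerance `tol` (needed because floating-point values are rarely exactly zero) are assumptions and do not come from the lecture.

```python
def classify_point(L, M, N, tol=1e-12):
    """Classify a surface point by the sign of LN - M^2."""
    d = L * N - M * M
    if d > tol:
        return "elliptical"
    if d < -tol:
        return "hyperbolic"
    if max(abs(L), abs(M), abs(N)) <= tol:
        return "parabolic, k(t) identically zero"
    return "parabolic"

print(classify_point(1.0, 0.0, 2.0))    # elliptical
print(classify_point(1.0, 0.0, -2.0))   # hyperbolic
print(classify_point(1.0, 2.0, 4.0))    # parabolic
print(classify_point(0.0, 0.0, 0.0))    # parabolic, k(t) identically zero
```

Only the sign of $LN - M^2$ matters here; since $EG - F^2 > 0$, the coefficients of the first quadratic form do not affect the type of the point.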

Note that at elliptical and parabolic points the indicatrix
of Dupin is a second degree curve, and at hyperbolic points
it is a quartic curve.

In each of the three cases the function k(t) twice attains
its maximum $k_1$ and its minimum $k_2$ (unless it is identically
equal to zero).







Definition 2. Numbers $k_1$ and $k_2$ are called the principal
curvatures of surface (1) at the point under consideration.
Their product

$$K = k_1 k_2$$

is called the total (or Gaussian) curvature and their half-sum

$$H = \frac{k_1 + k_2}{2}$$

is termed mean curvature.

According to the above said, K > 0 at an elliptical point,
K < 0 at a hyperbolic point, and K = 0 at a parabolic
point.

To find principal curvatures one could seek the principal 
directions of the second degree curves (8) and (9) (there is 
no problem with curve (10)) and then find their canonical 
equations. Unfortunately, this method involves lengthy 
computations because the coordinates x and y are not 
rectangular. Therefore we shall proceed in a different way, 
applying directly to the basic formula (6). 

According to this formula curvature $k_2$ is the smallest
value of the function

$$\frac{II(x, y)}{I(x, y)} = \frac{Lx^2 + 2Mxy + Ny^2}{Ex^2 + 2Fxy + Gy^2}$$

of two variables x and y, with $(x, y) \neq (0, 0)$. Hence

$$\frac{II(x, y)}{I(x, y)} \geq k_2$$

for all $(x, y) \neq (0, 0)$, equality holding at least at one point
(x, y). Since I(x, y) > 0 when $(x, y) \neq (0, 0)$, this
inequality is the same as the inequality

$$II(x, y) - k_2 I(x, y) \geq 0$$

implying that the quadratic form $II - k_2 I$ with matrix

$$\begin{pmatrix} L - k_2E & M - k_2F \\ M - k_2F & N - k_2G \end{pmatrix}$$

is nonnegative at all points $(x, y) \neq (0, 0)$ and zero at least
at one of them.

















Similarly, the number $k_1$ is characterized by the fact that
the quadratic form $II - k_1 I$ is everywhere nonpositive and
zero at least at one point $(x, y) \neq (0, 0)$.

But it is easy to see (directly or on the basis of the general
theory of quadratic forms over the field $\mathbb{R}$; see Lecture 12)
that a quadratic form in two variables is everywhere
nonpositive or nonnegative and zero at least at one point $(x, y) \neq (0, 0)$
if and only if its rank is less than two, i.e. if the determinant
of its matrix is zero. □

This proves that the principal curvatures $k_1$, $k_2$ are the
roots of the equation

$$\begin{vmatrix} L - kE & M - kF \\ M - kF & N - kG \end{vmatrix} = 0,$$

i.e. of the equation

$$(EG - F^2)k^2 - (EN + GL - 2FM)k + (LN - M^2) = 0.$$

In particular it follows (by virtue of Viète's formulas)
that

$$K = \frac{LN - M^2}{EG - F^2}, \qquad H = \frac{1}{2}\,\frac{EN + GL - 2FM}{EG - F^2}.$$

The first of these formulas will find an important
application in our next lecture.
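
The quadratic equation above translates directly into a small computation. The sketch below is illustrative only; the function name `curvatures` and the cylinder used as sample input (with outward normal, so that L = -R) are assumptions, not part of the lecture.

```python
import math

def curvatures(first, second):
    """Return (k1, k2, K, H) from the coefficients E, F, G and L, M, N."""
    E, F, G = first
    L, M, N = second
    a = E * G - F * F                  # coefficient of k^2, positive
    b = E * N + G * L - 2 * F * M      # minus the coefficient of k
    c = L * N - M * M                  # free term
    K = c / a
    H = b / (2 * a)
    # Roots of k^2 - 2Hk + K = 0; H^2 - K is nonnegative in theory,
    # so max() only guards against rounding.
    root = math.sqrt(max(H * H - K, 0.0))
    return H + root, H - root, K, H

# Circular cylinder of radius R: E = R^2, F = 0, G = 1, L = -R, M = 0, N = 0.
R = 2.0
k1, k2, K, H = curvatures((R * R, 0.0, 1.0), (-R, 0.0, 0.0))
print(k1, k2, K, H)
```

For this cylinder the principal curvatures are 0 (along the generator) and -1/R (along the circle), so K = 0 and H = -1/(2R); with the opposite choice of normal both nonzero values change sign.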

Suppose that coordinates x, y, z in space have been
chosen so that the surface under consideration is the graph
of a function z = z(x, y), with z(0, 0) = 0, and the normal
vector at the point (0, 0) is the unit vector k of the axis Oz.
It is easy to see that the last assumption is the same as the
assumption that $z_x(0, 0) = 0$ and $z_y(0, 0) = 0$. Hence expansion
of the function z(x, y) into a Taylor series begins with
quadratic terms:

$$z = \tfrac{1}{2}\left(rx^2 + 2sxy + ty^2\right) + \ldots,$$

where

$$r = z_{xx}(0, 0), \quad s = z_{xy}(0, 0), \quad t = z_{yy}(0, 0).$$

Since in this case $\mathbf{r} = u\mathbf{i} + v\mathbf{j} + z(u, v)\mathbf{k}$, we have
$\mathbf{r}_u = \mathbf{i} + z_u\mathbf{k}$, $\mathbf{r}_v = \mathbf{j} + z_v\mathbf{k}$ and $\mathbf{r}_{uu} = z_{uu}\mathbf{k}$, $\mathbf{r}_{uv} = z_{uv}\mathbf{k}$,
$\mathbf{r}_{vv} = z_{vv}\mathbf{k}$. Hence at the point (0, 0) we have L = r, M = s,
N = t, i.e. in the case under consideration the second
quadratic form coincides, up to the factor 1/2, with the sum
$z_2(x, y)$ of quadratic terms in the Taylor series of the
function z(x, y). □

Since near the point (0, 0) the surface z = z(x, y) differs
but little from the surface $z = z_2(x, y)$ and since for
$rt - s^2 > 0$ the latter surface is an elliptical paraboloid and
for $rt - s^2 < 0$ it is a hyperbolic paraboloid, this proves that
an arbitrary surface differs but little from the elliptical
paraboloid near an elliptical point and from the hyperbolic
paraboloid near a hyperbolic point. □

This gives a quite satisfactory idea of the behaviour of 
the surface near nonparabolic points. 

As to the behaviour of the surface near a parabolic point 
nothing definite can be said about it; it may be very complex 
in general. 

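For the ruled surface

$$\mathbf{r} = \boldsymbol{\rho}(u) + v\,\mathbf{a}(u), \tag{11}$$

as we already know,

$$E = 1 + 2v\,\dot{\boldsymbol{\rho}}\dot{\mathbf{a}} + v^2\dot{\mathbf{a}}^2, \quad F = \dot{\boldsymbol{\rho}}\mathbf{a}, \quad G = 1$$

(we as ever assume that the parameter u on the curve
$\boldsymbol{\rho} = \boldsymbol{\rho}(u)$ is natural and the vector a(u) is a unit vector).
Further

$$\mathbf{r}_u = \dot{\boldsymbol{\rho}} + v\dot{\mathbf{a}}, \quad \mathbf{r}_v = \mathbf{a},$$

$$\mathbf{r}_u \times \mathbf{r}_v = \dot{\boldsymbol{\rho}} \times \mathbf{a} + v(\dot{\mathbf{a}} \times \mathbf{a}),$$

$$\mathbf{n} = \frac{\dot{\boldsymbol{\rho}} \times \mathbf{a} + v(\dot{\mathbf{a}} \times \mathbf{a})}{\sqrt{EG - F^2}},$$

$$\mathbf{r}_{uu} = \ddot{\boldsymbol{\rho}} + v\ddot{\mathbf{a}}, \quad \mathbf{r}_{uv} = \dot{\mathbf{a}}, \quad \mathbf{r}_{vv} = 0,$$

$$L = \frac{(\ddot{\boldsymbol{\rho}} + v\ddot{\mathbf{a}})(\dot{\boldsymbol{\rho}} \times \mathbf{a} + v(\dot{\mathbf{a}} \times \mathbf{a}))}{\sqrt{EG - F^2}}, \quad
M = \frac{\dot{\boldsymbol{\rho}}\mathbf{a}\dot{\mathbf{a}}}{\sqrt{EG - F^2}}, \quad N = 0,$$

$$LN - M^2 = -\frac{(\dot{\boldsymbol{\rho}}\mathbf{a}\dot{\mathbf{a}})^2}{EG - F^2},$$

and therefore

$$K = -\frac{(\dot{\boldsymbol{\rho}}\mathbf{a}\dot{\mathbf{a}})^2}{(EG - F^2)^2}.$$

Thus the total curvature of an arbitrary ruled surface is
nonpositive, i.e. a ruled surface has no elliptical points. □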

When the surface is a cylinder ($\dot{\mathbf{a}} = 0$), a cone ($\mathbf{a} = \boldsymbol{\rho}$ and
therefore $\dot{\mathbf{a}} = \dot{\boldsymbol{\rho}}$) or a surface of tangents ($\mathbf{a} = \dot{\boldsymbol{\rho}}$), the
formula obtained yields K = 0. Thus the total curvature of
every developable is equal to zero.
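
The formula for K of a ruled surface can be tried out numerically. In the sketch below the function name `ruled_surface_K` and both sample surfaces are assumptions chosen for illustration. The code computes $EG - F^2$ directly as $|\mathbf{r}_u \times \mathbf{r}_v|^2$, so it does not rely on u being natural or on a(u) being a unit vector; this lets it also handle the hyperbolic paraboloid z = xy written as a ruled surface.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def ruled_surface_K(d_rho, a, d_a, v):
    """K = -(rho' a a')^2 / (EG - F^2)^2 for r = rho(u) + v a(u)."""
    ru = tuple(x + v * y for x, y in zip(d_rho, d_a))   # r_u = rho' + v a'
    w = cross(ru, a)                                    # r_u x r_v, since r_v = a
    W2 = dot(w, w)                                      # EG - F^2
    triple = dot(d_rho, cross(a, d_a))                  # mixed product rho' a a'
    return -(triple ** 2) / (W2 ** 2)

# Circular cylinder: rho(u) = (cos u, sin u, 0), a = (0, 0, 1), a' = 0.
u = 0.9
K_cyl = ruled_surface_K((-math.sin(u), math.cos(u), 0.0),
                        (0.0, 0.0, 1.0), (0.0, 0.0, 0.0), 1.7)

# Hyperbolic paraboloid z = xy: rho(u) = (u, 0, 0), a(u) = (0, 1, u).
u, v = 1.0, 2.0
K_hyp = ruled_surface_K((1.0, 0.0, 0.0), (0.0, 1.0, u), (0.0, 0.0, 1.0), v)
print(K_cyl, K_hyp, -1 / (1 + u * u + v * v) ** 2)
```

The last two printed numbers agree: for the graph z = xy the total curvature is $-1/(1 + x^2 + y^2)^2$, which is negative everywhere, in line with the absence of elliptical points on a ruled surface.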

Conversely, if K = 0, then $\dot{\boldsymbol{\rho}}\mathbf{a}\dot{\mathbf{a}} = 0$, i.e. the vectors $\dot{\boldsymbol{\rho}}$, $\mathbf{a}$, $\dot{\mathbf{a}}$
are coplanar. If the vector $\dot{\mathbf{a}}(u)$ is not identically zero, i.e.
if surface (11) is not a cylinder, then, passing, if necessary,
to a smaller neighbourhood, we may assume that $\dot{\mathbf{a}}(u) \neq 0$
for all u. The vectors $\mathbf{a}$ and $\dot{\mathbf{a}}$ are therefore linearly
independent (they are nonzero and orthogonal) and hence the vector
$\dot{\boldsymbol{\rho}}$ is linearly expressible in terms of them:

$$\dot{\boldsymbol{\rho}} = \lambda\mathbf{a} + \mu\dot{\mathbf{a}},$$

where $\lambda = \lambda(u)$, $\mu = \mu(u)$ are some functions of u.
Let

$$u_1 = u, \quad v_1 = v + \mu(u).$$

Since the Jacobian of this transformation is equal to 1, the
numbers $u_1$ and $v_1$ are also, after possibly passing to a
smaller neighbourhood, coordinates on surface (11), i.e., to be
more exact, they determine an equivalent surface. The
equation of that surface is of the form

$$\mathbf{r} = \boldsymbol{\rho}_1(u_1) + v_1\mathbf{a}(u_1),$$

where

$$\boldsymbol{\rho}_1(u) = \boldsymbol{\rho}(u) - \mu(u)\,\mathbf{a}(u).$$

If $\dot{\boldsymbol{\rho}}_1 = 0$ identically (i.e. $\lambda = \dot{\mu}$), then the equation of
surface (11) is of the form

$$\mathbf{r} = \mathrm{const} + v_1\mathbf{a}(u_1)$$

and therefore that surface is a cone. Otherwise we may
assume, diminishing if necessary the neighbourhood, that
$\dot{\boldsymbol{\rho}}_1(u) \neq 0$ for all u. Passing then to the natural parameter
(and changing if necessary the sign of $v_1$) we see that $\dot{\boldsymbol{\rho}}_1 = \mathbf{a}$,
i.e. that the surface under consideration is a surface of
tangents.

[Figure: A developable surface of tangents]

Thus we have proved the following proposition:
Proposition 1. A ruled surface has zero total curvature,

$$K = 0,$$

if and only if it is a developable. □

We have also established that developables are
characterized by the condition

$$\dot{\boldsymbol{\rho}}\mathbf{a}\dot{\mathbf{a}} = 0$$

which is easily seen to be equivalent to the collinearity of the
vectors $\dot{\boldsymbol{\rho}} \times \mathbf{a}$ and $\dot{\mathbf{a}} \times \mathbf{a}$. But the collinearity of these
vectors is equivalent to the fact that the vector

$$\mathbf{r}_u \times \mathbf{r}_v = \dot{\boldsymbol{\rho}} \times \mathbf{a} + v(\dot{\mathbf{a}} \times \mathbf{a})$$

is, up to proportionality, independent of v, i.e. that the
corresponding unit vector n is independent of v. This proves
that developables can be distinguished among all the
ruled surfaces by the property that at all the points of each
rectilinear generator such a surface has the same tangential
plane. □

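For an arbitrary surface of revolution

$$\mathbf{r} = x(v)\cos u \cdot \mathbf{i} + x(v)\sin u \cdot \mathbf{j} + z(v)\,\mathbf{k}$$

we have

$$\mathbf{r}_u = -x(v)\sin u \cdot \mathbf{i} + x(v)\cos u \cdot \mathbf{j},$$

$$\mathbf{r}_v = x'(v)\cos u \cdot \mathbf{i} + x'(v)\sin u \cdot \mathbf{j} + z'(v)\,\mathbf{k}$$

and hence $E = x(v)^2$, F = 0, G = 1 (we assume that
$x'(v)^2 + z'(v)^2 = 1$; see Lecture 25). Therefore

$$\mathbf{r}_u \times \mathbf{r}_v = x(v)z'(v)\cos u \cdot \mathbf{i} + x(v)z'(v)\sin u \cdot \mathbf{j} - x(v)x'(v)\,\mathbf{k},$$

$$\mathbf{n} = z'(v)\cos u \cdot \mathbf{i} + z'(v)\sin u \cdot \mathbf{j} - x'(v)\,\mathbf{k},$$

$$\mathbf{r}_{uu} = -x(v)\cos u \cdot \mathbf{i} - x(v)\sin u \cdot \mathbf{j},$$

$$\mathbf{r}_{uv} = -x'(v)\sin u \cdot \mathbf{i} + x'(v)\cos u \cdot \mathbf{j},$$

$$\mathbf{r}_{vv} = x''(v)\cos u \cdot \mathbf{i} + x''(v)\sin u \cdot \mathbf{j} + z''(v)\,\mathbf{k};$$

$$L = \mathbf{r}_{uu}\mathbf{n} = -x(v)z'(v), \quad M = \mathbf{r}_{uv}\mathbf{n} = 0,$$

$$N = x''(v)z'(v) - z''(v)x'(v) = -\begin{vmatrix} x'(v) & z'(v) \\ x''(v) & z''(v) \end{vmatrix},$$

$$\frac{LN - M^2}{EG - F^2} = \frac{z'(v)}{x(v)}\begin{vmatrix} x'(v) & z'(v) \\ x''(v) & z''(v) \end{vmatrix}.$$

[Figure: A pseudosphere]

This proves that for a surface of revolution

$$K = \frac{z'(v)}{x(v)}\begin{vmatrix} x'(v) & z'(v) \\ x''(v) & z''(v) \end{vmatrix}.$$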

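Example 1. For a sphere of radius R we have

$$x(v) = R\cos\frac{v}{R}, \quad z(v) = R\sin\frac{v}{R},$$

and therefore

$$x'(v) = -\sin\frac{v}{R}, \quad z'(v) = \cos\frac{v}{R},$$

$$x''(v) = -\frac{1}{R}\cos\frac{v}{R}, \quad z''(v) = -\frac{1}{R}\sin\frac{v}{R},$$

$$K = \frac{z'(v)}{x(v)}\begin{vmatrix} x'(v) & z'(v) \\ x''(v) & z''(v) \end{vmatrix} = \frac{1}{R^2}.$$

Thus the total curvature of a sphere of radius R is constant and
equal to $1/R^2$. □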
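
The computation of Example 1 can be repeated numerically at several points of the profile. This is a minimal sketch; the names `revolution_K` and `sphere_profile` are hypothetical, and the profile is assumed to be given in its natural parameter, as in the text.

```python
import math

def revolution_K(x, dx, dz, d2x, d2z):
    """K = (z'/x) * det[[x', z'], [x'', z'']] for a profile in natural parameter."""
    det = dx * d2z - dz * d2x
    return dz / x * det

def sphere_profile(R, v):
    """Values x, x', z', x'', z'' for x = R cos(v/R), z = R sin(v/R)."""
    c, s = math.cos(v / R), math.sin(v / R)
    return R * c, -s, c, -c / R, -s / R

R = 3.0
values = [revolution_K(*sphere_profile(R, v)) for v in (-2.0, 0.0, 1.0, 4.0)]
print(values)   # every entry is close to 1 / R**2
```

The sample values of v are kept inside the interval where x(v) > 0, i.e. away from the poles of the sphere, where the parametrization degenerates.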

The result is intuitively obvious. 

The following example is more interesting. 

Example 2. A surface of revolution with profile

$$x(v) = R\sin v, \quad z(v) = R\left(\ln\tan\frac{v}{2} + \cos v\right), \quad 0 < v < \frac{\pi}{2}$$

(it is the so-called tractrix) is termed a pseudosphere. For
this surface

$$x'(v) = R\cos v, \quad z'(v) = \frac{R}{\sin v} - R\sin v = R\,\frac{\cos^2 v}{\sin v}$$

and hence

$$x'(v)^2 + z'(v)^2 = R^2\cot^2 v.$$

Since $x'(v)^2 + z'(v)^2 \neq 1$, the general formula obtained
above is not applicable directly and it is necessary to first
pass to the natural parameter of the profile.

We have

$$s = R\int_v^{\pi/2}\cot v\,dv = -R\ln\sin v$$

and hence

$$\sin v = e^{-s/R}.$$

Thus in terms of the natural parameter (which is again
denoted by v) the tractrix will be given by the functions

$$x(v) = Re^{-v/R}, \quad
z(v) = R\left(\ln\frac{e^{-v/R}}{1 + \sqrt{1 - e^{-2v/R}}} + \sqrt{1 - e^{-2v/R}}\right).$$

We calculate:

$$x'(v) = -e^{-v/R}, \quad z'(v) = -\sqrt{1 - e^{-2v/R}},$$

$$x''(v) = \frac{1}{R}e^{-v/R}, \quad z''(v) = -\frac{e^{-2v/R}}{R\sqrt{1 - e^{-2v/R}}}.$$

Thus

$$\begin{vmatrix} x'(v) & z'(v) \\ x''(v) & z''(v) \end{vmatrix} = \frac{e^{-v/R}}{R\sqrt{1 - e^{-2v/R}}},$$

$$K = \frac{z'(v)}{x(v)}\begin{vmatrix} x'(v) & z'(v) \\ x''(v) & z''(v) \end{vmatrix} = -\frac{1}{R^2},$$

so that the total curvature of a pseudosphere is constant and
equal to $-\frac{1}{R^2}$. □

We see that in regard to total curvature the pseudosphere
differs from the sphere only in the sign of curvature. This
accounts for the term “pseudosphere”.
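
As a cross-check of Example 2, the curvature can also be evaluated without passing to the natural parameter. The sketch below uses the formula $K = z'(x'z'' - z'x'') / (x\,(x'^2 + z'^2)^2)$ for a profile in an arbitrary regular parameter; this variant is not given in the lecture and is stated here as an assumption (it reduces to the formula of the text when $x'^2 + z'^2 = 1$). The function names are hypothetical.

```python
import math

def revolution_K_general(x, dx, dz, d2x, d2z):
    """K for a profile (x(v), z(v)) in an arbitrary regular parameter v."""
    g = dx * dx + dz * dz          # equals 1 for the natural parameter
    return dz * (dx * d2z - dz * d2x) / (x * g * g)

def tractrix(R, v):
    """x, x', z', x'', z'' for x = R sin v, z = R (ln tan(v/2) + cos v)."""
    s, c = math.sin(v), math.cos(v)
    x = R * s
    dx = R * c
    dz = R * c * c / s
    d2x = -R * s
    d2z = -R * c * (1 + s * s) / (s * s)
    return x, dx, dz, d2x, d2z

R = 2.0
values = [revolution_K_general(*tractrix(R, v)) for v in (0.3, 0.8, 1.2)]
print(values)   # every entry is close to -1 / R**2
```

The agreement at several values of v illustrates that the total curvature does not depend on how the profile is parametrized.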






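Example 3. For the catenoid

$$x(v) = \operatorname{ch} v, \quad z(v) = v,$$

$$x'(v) = \operatorname{sh} v, \quad z'(v) = 1,$$

$$x'(v)^2 + z'(v)^2 = \operatorname{ch}^2 v,$$

and therefore we must again pass to the natural parameter

$$s = \int_0^v \operatorname{ch} v\,dv = \operatorname{sh} v.$$

Again denoting this parameter by v we obtain the functions

$$x(v) = \sqrt{1 + v^2}, \quad z(v) = \ln\left(v + \sqrt{1 + v^2}\right).$$

Therefore

$$x'(v) = \frac{v}{\sqrt{1 + v^2}}, \quad z'(v) = \frac{1}{\sqrt{1 + v^2}},$$

$$x''(v) = \frac{1}{(1 + v^2)^{3/2}}, \quad z''(v) = -\frac{v}{(1 + v^2)^{3/2}},$$

$$\begin{vmatrix} x'(v) & z'(v) \\ x''(v) & z''(v) \end{vmatrix} = -\frac{1}{1 + v^2},$$

and hence

$$K = -\frac{1}{(1 + v^2)^2}.$$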

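It is interesting to compare the curvature of the catenoid
with that of the helicoid isometric with it.

For the helicoid we have equation (11) with

$$\boldsymbol{\rho}(u) = u\mathbf{k}, \quad \mathbf{a}(u) = \cos u \cdot \mathbf{i} + \sin u \cdot \mathbf{j}.$$

Therefore

$$\dot{\boldsymbol{\rho}} = \mathbf{k}, \quad \dot{\mathbf{a}} = -\sin u \cdot \mathbf{i} + \cos u \cdot \mathbf{j},$$

$$E = 1 + 2v\,\dot{\boldsymbol{\rho}}\dot{\mathbf{a}} + v^2\dot{\mathbf{a}}^2 = 1 + v^2,$$

$$F = \dot{\boldsymbol{\rho}}\mathbf{a} = 0, \quad G = 1,$$

$$EG - F^2 = 1 + v^2,$$

$$\dot{\boldsymbol{\rho}}\mathbf{a}\dot{\mathbf{a}} = \begin{vmatrix} 0 & 0 & 1 \\ \cos u & \sin u & 0 \\ -\sin u & \cos u & 0 \end{vmatrix} = 1,$$

and hence

$$K = -\frac{1}{(1 + v^2)^2}.$$

We have obtained the same result as that for the catenoid!
This means that the total curvatures coincide at the
corresponding points when the catenoid is bent into the helicoid. □
What happens to the mean curvature?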

For the catenoid $E = 1 + v^2$, F = 0, G = 1. In addition

$$L = -x(v)z'(v) = -1, \quad M = 0,$$

$$N = -\begin{vmatrix} x'(v) & z'(v) \\ x''(v) & z''(v) \end{vmatrix} = \frac{1}{1 + v^2},$$

and therefore

$$EN + GL - 2FM = 0,$$

i.e.

$$H = 0.$$

Thus the mean curvature of the catenoid is equal to zero. □
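For the helicoid, on the other hand,

$$\dot{\boldsymbol{\rho}} \times \mathbf{a} = -\sin u \cdot \mathbf{i} + \cos u \cdot \mathbf{j}, \quad \dot{\mathbf{a}} \times \mathbf{a} = -\mathbf{k},$$

$$\ddot{\boldsymbol{\rho}} = 0, \quad \ddot{\mathbf{a}} = -\cos u \cdot \mathbf{i} - \sin u \cdot \mathbf{j},$$

$$(\ddot{\boldsymbol{\rho}} + v\ddot{\mathbf{a}})(\dot{\boldsymbol{\rho}} \times \mathbf{a} + v(\dot{\mathbf{a}} \times \mathbf{a})) = 0$$

and in addition, as we have already seen,

$$E = 1 + v^2, \quad F = 0, \quad G = 1, \quad \dot{\boldsymbol{\rho}}\mathbf{a}\dot{\mathbf{a}} = 1.$$

Therefore

$$L = 0, \quad M = \frac{1}{\sqrt{1 + v^2}}, \quad N = 0,$$

and hence

$$EN + LG - 2FM = 0,$$

i.e.

$$H = 0.$$

Thus the mean curvature of the helicoid is also equal to zero. □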
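
The comparison of the two surfaces can be replayed numerically from the coefficients found above. This is only a sketch; the function names are hypothetical, and the coefficient values are those derived in the text for the catenoid (natural parameter v of the profile) and for the helicoid.

```python
import math

def total_and_mean(first, second):
    """Return (K, H) from the coefficients of forms I and II."""
    E, F, G = first
    L, M, N = second
    w2 = E * G - F * F
    K = (L * N - M * M) / w2
    H = (E * N + G * L - 2 * F * M) / (2 * w2)
    return K, H

def catenoid_forms(v):
    # E = 1 + v^2, F = 0, G = 1;  L = -1, M = 0, N = 1 / (1 + v^2)
    return (1 + v * v, 0.0, 1.0), (-1.0, 0.0, 1 / (1 + v * v))

def helicoid_forms(v):
    # E = 1 + v^2, F = 0, G = 1;  L = 0, M = 1 / sqrt(1 + v^2), N = 0
    return (1 + v * v, 0.0, 1.0), (0.0, 1 / math.sqrt(1 + v * v), 0.0)

for v in (0.0, 0.5, 2.0):
    Kc, Hc = total_and_mean(*catenoid_forms(v))
    Kh, Hh = total_and_mean(*helicoid_forms(v))
    print(v, Kc, Kh, Hc, Hh)
```

The two surfaces share the first quadratic form but have different second quadratic forms, and still K and H coincide at corresponding points.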

The example of catenoid and helicoid suggests that total
and mean curvatures are preserved under bending (isometry).
It turns out that this hypothesis is true for total curvature
(and we shall show this in our next lecture) whereas
for mean curvature it is false. Indeed, for a plane the mean
curvature is equal to zero while for a circular cylinder of
radius R developable into a plane it is obviously equal to
1/(2R).

The reasons why the catenoid and helicoid have turned
out to have equal mean curvatures are deep and interesting
but we are deprived of the possibility of discussing them
here.





Lecture 27 


Weingarten's derivation formulas. Coefficients of connection.
The Gauss theorem. The necessary and sufficient conditions of
isometry


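For the moving basis $\mathbf{r}_u$, $\mathbf{r}_v$, $\mathbf{n}$ of an arbitrary surface

$$\mathbf{r} = \mathbf{r}(u, v) \tag{1}$$

formulas can be written, similar to Frenet’s formulas for
curves, that yield an expansion of the derivatives

$$\mathbf{r}_{uu}, \ \mathbf{r}_{uv}, \ \mathbf{r}_{vv}, \ \mathbf{n}_u, \ \mathbf{n}_v$$

of the vectors of the moving basis with respect to that same
basis.

Since $\mathbf{n}^2 = 1$ and hence $\mathbf{n}\mathbf{n}_u = 0$ and $\mathbf{n}\mathbf{n}_v = 0$, the vectors
$\mathbf{n}_u$ and $\mathbf{n}_v$ are expanded only with respect to the vectors
$\mathbf{r}_u$ and $\mathbf{r}_v$, so that

$$\mathbf{n}_u = \alpha\mathbf{r}_u + \beta\mathbf{r}_v,$$

$$\mathbf{n}_v = \alpha_1\mathbf{r}_u + \beta_1\mathbf{r}_v.$$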

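Multiplying the first of these formulas by $\mathbf{r}_u$ and $\mathbf{r}_v$ we obtain
two relations:

$$-L = \mathbf{r}_u\mathbf{n}_u = \alpha\mathbf{r}_u^2 + \beta\mathbf{r}_u\mathbf{r}_v = \alpha E + \beta F,$$

$$-M = \mathbf{r}_v\mathbf{n}_u = \alpha\mathbf{r}_u\mathbf{r}_v + \beta\mathbf{r}_v^2 = \alpha F + \beta G,$$

from which it follows that

$$\alpha = \frac{FM - GL}{EG - F^2}, \quad \beta = \frac{FL - EM}{EG - F^2}.$$

Similarly calculated are the coefficients of the second formula:

$$\alpha_1 = \frac{FN - GM}{EG - F^2}, \quad \beta_1 = \frac{FM - EN}{EG - F^2}.$$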
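
The four coefficients can be checked against the relations they were derived from. The sketch below uses arbitrary sample values for E, F, G, L, M, N (hypothetical numbers with $EG - F^2 > 0$) and the hypothetical function name `weingarten_coefficients`. It also prints a consequence that is not stated in the lecture but follows from the formulas by direct expansion: the determinant of the coefficient matrix equals K and its trace equals -2H.

```python
def weingarten_coefficients(first, second):
    """Coefficients in n_u = alpha r_u + beta r_v, n_v = alpha1 r_u + beta1 r_v."""
    E, F, G = first
    L, M, N = second
    w2 = E * G - F * F
    alpha = (F * M - G * L) / w2
    beta = (F * L - E * M) / w2
    alpha1 = (F * N - G * M) / w2
    beta1 = (F * M - E * N) / w2
    return alpha, beta, alpha1, beta1

# Arbitrary sample coefficients (hypothetical values with EG - F^2 > 0).
first, second = (2.0, 0.5, 3.0), (1.0, -0.4, 0.7)
E, F, G = first
L, M, N = second
alpha, beta, alpha1, beta1 = weingarten_coefficients(first, second)

# The defining relations: each pair of printed numbers agrees.
print(alpha * E + beta * F, -L)
print(alpha * F + beta * G, -M)
print(alpha1 * E + beta1 * F, -M)
print(alpha1 * F + beta1 * G, -N)

# Determinant and trace of the coefficient matrix reproduce K and -2H.
K = (L * N - M * M) / (E * G - F * F)
H = (E * N + G * L - 2 * F * M) / (2 * (E * G - F * F))
print(alpha * beta1 - beta * alpha1, K)
print(alpha + beta1, -2 * H)
```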

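Further, since by definition

$$\mathbf{r}_{uu}\mathbf{n} = L, \quad \mathbf{r}_{uv}\mathbf{n} = M, \quad \mathbf{r}_{vv}\mathbf{n} = N$$

and since by the hypothesis $\mathbf{r}_u\mathbf{n} = 0$, $\mathbf{r}_v\mathbf{n} = 0$, the
coefficients of n in the expansions of the vectors $\mathbf{r}_{uu}$, $\mathbf{r}_{uv}$, $\mathbf{r}_{vv}$
with respect to the basis $\mathbf{r}_u$, $\mathbf{r}_v$, $\mathbf{n}$ are equal to L, M, N
respectively.

We thus have

$$\begin{aligned}
\mathbf{r}_{uu} &= \Gamma^1_{11}\mathbf{r}_u + \Gamma^2_{11}\mathbf{r}_v + L\mathbf{n}, \\
\mathbf{r}_{uv} &= \Gamma^1_{12}\mathbf{r}_u + \Gamma^2_{12}\mathbf{r}_v + M\mathbf{n}, \\
\mathbf{r}_{vv} &= \Gamma^1_{22}\mathbf{r}_u + \Gamma^2_{22}\mathbf{r}_v + N\mathbf{n}, \\
\mathbf{n}_u &= \frac{FM - GL}{EG - F^2}\,\mathbf{r}_u + \frac{FL - EM}{EG - F^2}\,\mathbf{r}_v, \\
\mathbf{n}_v &= \frac{FN - GM}{EG - F^2}\,\mathbf{r}_u + \frac{FM - EN}{EG - F^2}\,\mathbf{r}_v,
\end{aligned} \tag{2}$$

where $\Gamma^k_{ij}$, i, j, k = 1, 2, are some functions of u and v.
Formerly these functions were designated by the symbols

$$\begin{Bmatrix} ij \\ k \end{Bmatrix}$$

and called Christoffel symbols. But now they are usually
called connection coefficients.

Formulas (2) are called Weingarten's derivation formulas.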


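To compute connection coefficients $\Gamma^k_{ij}$ we first find the
six products of vectors $\mathbf{r}_{uu}$, $\mathbf{r}_{uv}$, $\mathbf{r}_{vv}$ by vectors $\mathbf{r}_u$ and $\mathbf{r}_v$.

Since $\mathbf{r}_u^2 = E$, we have $2\mathbf{r}_{uu}\mathbf{r}_u = E_u$ and $2\mathbf{r}_{uv}\mathbf{r}_u = E_v$, i.e.

$$\mathbf{r}_{uu}\mathbf{r}_u = \tfrac{1}{2}E_u \quad \text{and} \quad \mathbf{r}_{uv}\mathbf{r}_u = \tfrac{1}{2}E_v.$$

Similarly, since $\mathbf{r}_v^2 = G$, we have

$$\mathbf{r}_{uv}\mathbf{r}_v = \tfrac{1}{2}G_u \quad \text{and} \quad \mathbf{r}_{vv}\mathbf{r}_v = \tfrac{1}{2}G_v.$$

Besides, since $\mathbf{r}_u\mathbf{r}_v = F$, we have $\mathbf{r}_{uu}\mathbf{r}_v + \mathbf{r}_u\mathbf{r}_{uv} = F_u$ and
$\mathbf{r}_{uv}\mathbf{r}_v + \mathbf{r}_u\mathbf{r}_{vv} = F_v$, from which it follows that

$$\mathbf{r}_{uu}\mathbf{r}_v = F_u - \tfrac{1}{2}E_v \quad \text{and} \quad \mathbf{r}_{vv}\mathbf{r}_u = F_v - \tfrac{1}{2}G_u.$$

Now multiplying the first three of the formulas (2) by $\mathbf{r}_u$
and $\mathbf{r}_v$ we obtain six relations:

$$\begin{cases} E\Gamma^1_{11} + F\Gamma^2_{11} = \tfrac{1}{2}E_u, \\ F\Gamma^1_{11} + G\Gamma^2_{11} = F_u - \tfrac{1}{2}E_v, \end{cases}$$

$$\begin{cases} E\Gamma^1_{12} + F\Gamma^2_{12} = \tfrac{1}{2}E_v, \\ F\Gamma^1_{12} + G\Gamma^2_{12} = \tfrac{1}{2}G_u, \end{cases}$$

$$\begin{cases} E\Gamma^1_{22} + F\Gamma^2_{22} = F_v - \tfrac{1}{2}G_u, \\ F\Gamma^1_{22} + G\Gamma^2_{22} = \tfrac{1}{2}G_v, \end{cases}$$

from which it is easy to find the coefficients $\Gamma^k_{ij}$.

(The equations are uniquely solvable since the determinant
$EG - F^2$ of every pair of equations is nonzero.)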

We see that the connection coefficients $\Gamma^k_{ij}$ can be expressed
in terms of the coefficients of the first quadratic form and of
their derivatives. Hence they remain unaffected under
bendings (isometries) of a surface. □
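
Solving the three pairs of equations is a routine application of Cramer's rule. The following sketch is illustrative; the function name `connection_coefficients`, the dictionary layout keyed by (k, i, j), and the choice of the plane in polar coordinates as test case are assumptions.

```python
def connection_coefficients(first, d_first):
    """Solve the three 2x2 systems for the coefficients Gamma^k_ij.

    first   = (E, F, G)
    d_first = (E_u, E_v, F_u, F_v, G_u, G_v)
    Returns a dict keyed by (k, i, j) with i <= j.
    """
    E, F, G = first
    Eu, Ev, Fu, Fv, Gu, Gv = d_first
    w2 = E * G - F * F
    rhs = {
        (1, 1): (Eu / 2, Fu - Ev / 2),
        (1, 2): (Ev / 2, Gu / 2),
        (2, 2): (Fv - Gu / 2, Gv / 2),
    }
    gamma = {}
    for (i, j), (b1, b2) in rhs.items():
        # Cramer's rule for E x + F y = b1, F x + G y = b2.
        gamma[(1, i, j)] = (b1 * G - b2 * F) / w2
        gamma[(2, i, j)] = (E * b2 - F * b1) / w2
    return gamma

# Plane in polar coordinates u = radius, v = angle: E = 1, F = 0, G = u^2.
u = 2.0
gamma = connection_coefficients((1.0, 0.0, u * u),
                                (0.0, 0.0, 0.0, 0.0, 2 * u, 0.0))
print(gamma[(1, 2, 2)], gamma[(2, 1, 2)])   # -u and 1/u
```

Only E, F, G and their first derivatives enter the computation, which is the content of the statement that the connection coefficients are unaffected by bendings.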

We shall not need explicit expressions for coefficients
$\Gamma^k_{ij}$ in terms of the coefficients of the first quadratic form,
and so we shall not write them out.

The coefficients of derivation formulas are connected by
three relations resulting from calculating partial derivatives
$\mathbf{r}_{uuv}$, $\mathbf{r}_{uvv}$ in two different ways using these formulas.
One of these relations was found by Gauss and the other
two by Peterson, Codazzi and Mainardi. We shall consider
only Gauss’ relation which we shall obtain by calculating
the coefficient of $\mathbf{r}_v$ in the expansion of the partial derivative
$\mathbf{r}_{uuv}$ with respect to the vectors $\mathbf{r}_u$, $\mathbf{r}_v$, and $\mathbf{n}$.

In this calculation we shall only follow the coefficient
of $\mathbf{r}_v$ and only those of its terms which depend on the
coefficients of the second quadratic form. All the other terms
will be replaced by dots.

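We have

$$\mathbf{r}_{uuv} = (\mathbf{r}_{uu})_v = (\Gamma^1_{11}\mathbf{r}_u + \Gamma^2_{11}\mathbf{r}_v + L\mathbf{n})_v
= \ldots + L\mathbf{n}_v + \ldots
= \left(\ldots + L\,\frac{FM - EN}{EG - F^2} + \ldots\right)\mathbf{r}_v + \ldots$$

Similarly

$$\mathbf{r}_{uuv} = (\mathbf{r}_{uv})_u = (\Gamma^1_{12}\mathbf{r}_u + \Gamma^2_{12}\mathbf{r}_v + M\mathbf{n})_u
= \left(\ldots + M\,\frac{FL - EM}{EG - F^2} + \ldots\right)\mathbf{r}_v + \ldots$$

Hence

$$L\,\frac{FM - EN}{EG - F^2} = M\,\frac{FL - EM}{EG - F^2} + \ldots,$$

where dots denote terms depending only on the coefficients
of the first quadratic form. But

$$M\,\frac{FL - EM}{EG - F^2} - L\,\frac{FM - EN}{EG - F^2} = E\,\frac{LN - M^2}{EG - F^2} = EK.$$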


Since $E \neq 0$ (form I is positive definite), this proves that
the total curvature K of a surface is expressible in terms of the
coefficients of the first quadratic form (and of their derivatives).
It follows that the curvature K remains unaffected under
bendings. This result deserves to be distinguished as a
theorem.

Theorem 1 (the Gauss theorem). The total (Gaussian)
curvature of a surface remains unaffected under bendings
(isometries), i.e. isometric surfaces have the same curvature at
points corresponding to each other. □

Gauss was so delighted with the theorem that he called it
theorema egregium, which means a “brilliant theorem” in
Latin. From Theorem 1 it follows in particular that no
arbitrarily small part of a sphere can be bent into a plane.
Therefore no map gives an absolutely faithful representation of
the Earth’s surface.

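An explicit expression for curvature K in terms of the
coefficients E, F and G of the first quadratic form is

$$K = -\frac{1}{4(EG - F^2)^2}\begin{vmatrix} E & E_u & E_v \\ F & F_u & F_v \\ G & G_u & G_v \end{vmatrix}
- \frac{1}{2\sqrt{EG - F^2}}\left\{\frac{\partial}{\partial v}\left(\frac{E_v - F_u}{\sqrt{EG - F^2}}\right) - \frac{\partial}{\partial u}\left(\frac{F_v - G_u}{\sqrt{EG - F^2}}\right)\right\}. \tag{3}$$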
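
To see the Gauss theorem at work, K can be computed from the first quadratic form alone. The sketch below treats only the special case F = 0, in which the determinant in formula (3) vanishes and the formula reduces to $K = -\frac{1}{2W}\left(\frac{\partial}{\partial u}\frac{G_u}{W} + \frac{\partial}{\partial v}\frac{E_v}{W}\right)$ with $W = \sqrt{EG}$. The derivatives are approximated by central differences; the function name, the step `h` and the two sample metrics are assumptions made for illustration.

```python
import math

def gauss_curvature_orthogonal(E, G, u, v, h=1e-3):
    """Approximate K from E(u, v), G(u, v) alone when F = 0."""
    def W(a, b):
        return math.sqrt(E(a, b) * G(a, b))

    def A(a, b):   # G_u / W, with G_u from a central difference
        return (G(a + h, b) - G(a - h, b)) / (2 * h) / W(a, b)

    def B(a, b):   # E_v / W, with E_v from a central difference
        return (E(a, b + h) - E(a, b - h)) / (2 * h) / W(a, b)

    dA_du = (A(u + h, v) - A(u - h, v)) / (2 * h)
    dB_dv = (B(u, v + h) - B(u, v - h)) / (2 * h)
    return -(dA_du + dB_dv) / (2 * W(u, v))

# Sphere of radius R in longitude u and latitude v:
# I = R^2 cos^2 v du^2 + R^2 dv^2.
R = 2.0
K_sphere = gauss_curvature_orthogonal(lambda u, v: (R * math.cos(v)) ** 2,
                                      lambda u, v: R * R, 0.4, 0.7)
# Plane in polar coordinates: I = du^2 + u^2 dv^2.
K_plane = gauss_curvature_orthogonal(lambda u, v: 1.0,
                                     lambda u, v: u * u, 1.5, 0.2)
print(K_sphere, K_plane)   # approximately 1 / R**2 and 0
```

No normal vector and no second quadratic form appear anywhere in the computation; the results are approximate because of the finite-difference step.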

The other two relations, obtained from differentiating the
derivation formulas (and usually called the Peterson-Codazzi
formulas) are of the form

$$\begin{aligned}
2(EG - F^2)(L_v - M_u) - (EN + GL - 2FM)(E_v - F_u) + \begin{vmatrix} E & E_u & L \\ F & F_u & M \\ G & G_u & N \end{vmatrix} &= 0, \\
2(EG - F^2)(M_v - N_u) - (EN + GL - 2FM)(F_v - G_u) + \begin{vmatrix} E & E_v & L \\ F & F_v & M \\ G & G_v & N \end{vmatrix} &= 0.
\end{aligned} \tag{4}$$

To prove these formulas all one needs is patience and
carefulness.


The Gauss theorem states that the equality of total
curvatures is a necessary condition for the isometry of two
surfaces. At the same time, although this condition is by no
means sufficient, it is so strong that using it sufficient
conditions can be obtained without difficulty. We shall not
expound this question and only consider the most important
special case of the corresponding theorem.

Let

$$\Delta_1 K = \frac{EK_v^2 - 2FK_uK_v + GK_u^2}{EG - F^2}.$$

(It is Beltrami’s first differential parameter of the function
K calculated in “curvilinear” coordinates u and v.) If the
two functions K and $\Delta_1 K$ of u and v are functionally
independent, i.e. if their Jacobian

$$\begin{vmatrix} \dfrac{\partial K}{\partial u} & \dfrac{\partial K}{\partial v} \\[2mm] \dfrac{\partial \Delta_1 K}{\partial u} & \dfrac{\partial \Delta_1 K}{\partial v} \end{vmatrix}$$

is nonzero, then they may be taken as new local coordinates
on surface (1). We call these coordinates Gaussian
coordinates. A direct calculation shows that any diffeomorphism of
a surface preserving the function K (in particular any
isometry) leaves the function $\Delta_1 K$ invariant too. In particular
every isometry is therefore a mapping defined by equating
Gaussian coordinates. This means that the following theorem
is true.

Theorem 2. Two surfaces which have Gaussian coordinates
defined on them are isometric if and only if in these coordinates
their first quadratic forms coincide. □

Thus, to determine whether or not two surfaces are
isometric it is necessary to introduce (if possible) Gaussian
coordinates and calculate in these coordinates the first
quadratic forms of the surfaces. If the forms coincide, the
surfaces are isometric, but if they are different, the surfaces
are not isometric.

Theorem 2 gives no answer when K and $\Delta_1 K$ are
functionally dependent, for example when $\Delta_1 K = 0$ (which occurs,
as can be easily figured out, if and only if K = const). In
this extreme case it can be shown, however, that the
condition of Theorem 1 proves to be sufficient, i.e. two surfaces
of constant total curvature are isometric if and only if they
have the same curvature. In other words, any surface of
constant total curvature K is isometric with a sphere of radius
$R = \frac{1}{\sqrt{K}}$ if K > 0, with a plane if K = 0, and with a
pseudosphere with parameter $R = \frac{1}{\sqrt{-K}}$ if K < 0. The proof
consists in constructing explicitly coordinates u, v in which
the first quadratic form coincides with the first quadratic
form of a sphere, a plane, and a pseudosphere respectively.
Unfortunately we have no time to spare.









SUBJECT INDEX 


Alternation, 67 

Anti-isomorphism of algebras, 
147 

Basis conjugate to a basis, 38 
Beltrami’s first differential parameter, 242

Catenary, 287 
Catenoid, 287 
Christoffel brackets, 311
Coefficients of connection, 311 
Complement, orthogonal, 181 
Cone, k-fold, 128
Coordinates, contravariant, 189 
covariant, 189 
Plucker, 97 
Covector, 36 

Curvature, radius of, 257 
Curve, catenary, 287 
curvature of, 255, 258 
curvatures of, 265 
in a Euclidean (affine) space, 
243 

mean curvature of, 299 
nonparametric, 246 
of the general type, 257, 258, 
263 

regular, 245 


relative curvature of, 251 
smooth, 244 
Cylinder, 272 
k-fold, 128

Darboux’s differential parameter, 
242 

Developables, 289 
Diffeomorphism, 270 

Eigenvalue, 151 
algebraic multiplicity of, 153 
geometrical multiplicity of, 151 
Eigenvalues, minimax property 
of, 204 

Eigenvector, 151 

Factor space of a vector modulo 
a subspace, 32 
Form (s), normal, 116 
quadratic, 112 
law of inertia for, 122 
positive definite, 124 
of a surface, first, 279 
second, 296 

Fredholm alternative, 60 
Frenet’s formulas, 257, 259, 265 
Frenet’s moving basis, 257, 258, 264






Function, gradient of, 223 
graph of, 224 
harmonic, 241 
Functional, bilinear, 48 
matrix of, 49 
matrix rank of, 108 
mixed, 52 
linear, 36 

tensor product of, 50 
multilinear, 62 
degree of, 62 
rank of, 65 
rank space of, 62 
quadratic, 110 
positive definite, 124 
sesquilinear, 179 
skew-symmetric, 73 

essential coefficients of, 82 
symmetric, 106 

Gauss theorem, 313 
Grassman algebra, 77 
Grassman manifold, 105 
Group, symmetric, 66 
unitary, 213 

Hamilton-Cayley theorem, 171 
Hamilton’s symbolic vector field, 
236 

Helicoid, 287 
Helix, circular, 259 
Hyperplane, 98 
tangential, 253 

Hypersurface, of an affine space, 135

central, 133 

noncentral second degree, 133 
oval second degree, 135 
of a projective space, 129 
second degree, 134, 137 
singular point of, 228 
smooth (regular), 223 


Index, inertial, 123 
negative, 123 
positive, 123 
lowering of, 189 
raising of, 189 
Indicatrix of Dupin, 296

Jacobi theorem, 121 
Jordan cell, 167 
Jordan normal form, 167 

Kronecker delta symbol, 42 
Kronecker-Capelli theorem, 25 

Lagrange algorithm, 114 
Lagrange theorem, 114 
Length of a continuous curve, 
253 

Manifold, linear, 99 
Mapping, linear, 33 
Matrix, of the coefficients, 25 
augmented, 25 
positive definite, 124 
rank of, 20 
unitriangular, 118 
Minor, 20 

Minor, principal, 119 
Minor determinant, 20 
Multivector, of degree p, 86 

Net, coordinate, 156 
Nilpotency of a (matrix) operator, degree of, 158

Operator, Laplacian, 241 
linear, 140 
adjoint, 191 

complexification of, 171 
cyclic, 159 
diagonalizable, 156 
Hermitian, 192 












Operator, linear (cont.) 
identity, 141 
invertible, 144 
isometric, 212 
matrix of, 143 
nilpotent, 158 
nonnegative, 209 
nonsingular, 144 
normal, 205 
orthogonal, 212 
orthogonally diagonaliz- 
able, 198 
positive, 209 
restriction of, 149 
scalar, 142 
self-adjoint, 193 
singular, 144 
skew-Hermitian, 193 
skew-symmetric, 193 
spectrum of, 157 
symmetrical, 192 
unitary, 217 
zero, 141 

Pairing, 39 

Paraboloid, elliptical, 137 
hyperbolic, 137 
Permutation, 66 
even, 66 
odd, 66 

Peterson-Codazzi formulas, 314 
Plane, normal, 263 
osculating, 263 

of a projective space, r-di- 
mensional, 102 

parametric vector equation of, 
99 

p-dimensional, 98 
rectifying, 263 
Pliicker relations, 95 
Point of self-intersection, 249 


Polynomial, characteristic, 153 
in an operator, 168 
Product, external, 75 
Pseudosphere, 305 

Rank, external, 88 
Root subspace, 163 
Root vector, 163 
Rotation, 215
generalized, 215 

Set, annulet of, 44 
linear span of, 12 
rank of, 17 
span of, 12 

Simple spectrum linear operator, 
157 

Space, pseudo-Euclidean, 113 
conjugate to a vector space, 36 
Spaces, direct sum of, 35 
Straight line, 98 
Submatrix, principal, 119 
Subspace, 11 

belonging to an eigenvalue, 
151 

direction p-vector of, 97 
invariant, 149 
trivial, 12 
zero, 12 

Subspaces, complementary, 30 
direct sum of, 28 
sum of, 14 

Surface, elliptical point of, 297
hyperbolic point of, 298 
of binormals, 275 
of principal normals, 275 
of tangents, 275 
parabolic point of, 298 
principal curvatures of, 290 
regular, 270 
ruled, 274 




support of, 270 
total curvature of, 299 
Surfaces, developable, 289 
isometric, 283 
isometry of, 283 
Sylvester’s criterion, 125 
System of solutions, fundamental, 45

Tangent vector, 245, 250 
Tensor, coefficients of, 55 
contraction of, 60 
(p, q)-, 54 

Tensor transformation law, 56 
Torsion, 259 
Trace, 61 
Tractrix, 305 
Transformation, affine, 
centroaffine, 215 
orthogonal, 214 
unitary, 214 
Translation, 215 
parallel, 215 

Vector, binormal, 258 

of the principal normal to a 
curve, 258 

to a surface, normal, 291 



Vectors, congruent modulo a 
subspace, 31 
tensor product of, 50 
Vector field, 227 
divergence of, 234 
gradient, 230 
irrotational, 230 
potential, 230 
rotation of, 231 
singular point of, 228 
Vector potential, 234 
Vector space, 11 

complexification of, 170 
tensor algebra of, 59 
Vector spaces, coimage of a homomorphism of, 34
dual, 39 

epimorphism of, 33 
homomorphism of, 33 
image of a homomorphism of, 33

kernel of a homomorphism of, 
33 

monomorphism of, 33 
morphism of, 33 

Weingarten’s derivation formulas, 311










TO THE READER 


Mir Publishers would be grateful for your comments 
on the content, translation and design of this book. 

We would also be pleased to receive any other 
suggestions you may wish to make. 

Our address is: 

Mir Publishers 
2 Pervy Rizhsky Pereulok 
I-110, GSP, Moscow, 129820 
USSR 

UDC 516-20 


Mikhail Mikhailovich Postnikov 


LECTURES IN GEOMETRY 
Semester 2 

LINEAR ALGEBRA AND DIFFERENTIAL GEOMETRY 

Scientific editor Zh. I. Suslovich. Editor R. V. Didenko. 
Artist N. V. Zotova. Art editor N. V. Zotova. 
Technical editor G. B. Alyulina. Proofreaders O. L. Vilman, 
S. N. Kapitanova. 

IB No. 2968 


Sent to typesetting 25.03.81. Signed for printing 22.10.81. Format 84×108 1/32. 
Printing paper No. 1. Ordinary typeface. Letterpress printing. 
Volume 10 paper sheets. 16.80 conventional printed sheets. 17.17 conventional colour impressions. 15.45 publisher's sheets. 
Publication No. 17/1242. Print run 7440 copies. Order No. 01325. Price 1 rouble 59 kopecks. 

MIR PUBLISHERS. Moscow, 2 Pervy Rizhsky Pereulok. 

Order of the Red Banner of Labour 
Moscow Printing House No. 7 "Iskra Revolyutsii" of Soyuzpoligrafprom, 
USSR State Committee for Publishing, Printing 
and the Book Trade. Moscow 103001, Trekhprudny Pereulok, 9. 

1702040000 


20203—019 
056(01)—82 


19—82, q. 


2 


Printed in the Union of Soviet Socialist Republics 


















M. Postnikov received his doctor's 
degree (Phys.-Math.) in 1954. From 
1955 to 1960 he was Professor of the 
Higher Algebra Department 
(Moscow State University), 
and since 1965 he has been 
Professor of the University's 
Department of Higher Geometry 
and Topology. 

He is a senior research worker at 
the Steklov Institute of Mathematics. 
In 1961 Prof. Postnikov was awarded 
the Lenin Prize for a series of works on 
the Homotopy Theory of Continuous 
Maps. He is the author of several 
monographs and textbooks, some 
of which have been translated into 
foreign languages. 




ABOUT THE PUBLISHERS 
Mir Publishers of Moscow publish Soviet 
scientific and technical literature 
in eleven languages: English, German, 
French, Italian, Spanish, Czech, 
Serbo-Croat, Slovak, Hungarian, 
Mongolian, and Arabic. Titles include 
textbooks for higher technical schools 
and vocational schools, literature on 
the natural sciences and medicine, 
including textbooks for medical schools, 
popular science and science fiction. 

The contributors to Mir Publishers' 
list are leading Soviet scientists and 
engineers in all fields of science and 
technology and include more than 
40 Members and Corresponding Members 
of the USSR Academy of Sciences. 

Skilled translators provide a high 
standard of translation from the 
original Russian. 

Many of the titles already issued by 
Mir Publishers have been adopted 
as textbooks and manuals at educational 
establishments in France, Cuba, Egypt, 
India, and many other countries. 

Mir Publishers' books in foreign 
languages are exported by 
V/O "Mezhdunarodnaya Kniga" and can 
be purchased or ordered through 
booksellers in your country dealing 
with V/O "Mezhdunarodnaya Kniga". 






